
Nod-factor perception 
Field of the invention

The invention relates to a novel Nod-factor binding element and component
polypeptides that are useful in enhancing Nod-factor binding in nodulating
plants and inducing nodulation in non-nodulating plants. More specifically,
the invention relates to Nod-factor binding proteins and their respective
genomic and mRNA nucleic acid sequences.

Background of the invention

The growth of agricultural crops is almost always limited by the availability of
nitrogen, and at least 50% of global needs are met by the application of
synthetic fertilisers in the form of ammonia, nitrate or urea. Apart from
recycling of crop residues and animal manure, and atmospheric deposition,
the other most important source of nitrogen for agriculture comes from
biological nitrogen fixation.

A small percentage of prokaryotes, the diazotrophs, produce nitrogenases and
are capable of nitrogen fixation. Members of this group, belonging to the
Rhizobiaceae family (for example Mesorhizobium loti, Rhizobium meliloti,
Bradyrhizobium japonicum, Rhizobium leguminosarum bv viceae) here
collectively called Rhizobium or Rhizobia spp and the actinobacterium
Frankia spp, can form endosymbiotic associations with plants conferring the
ability to fix nitrogen. Although many plants can associate with nitrogen fixing
bacteria, only a few plants, all members of the Rosid I clade, form
endosymbiotic associations with Rhizobia spp and Frankia spp., which are
unique in that most of the nitrogen is transferred to and assimilated by the
host plant. Legumes, including soybean, bean, pea, peanut, chickpea,
cowpea, lentil, pigeonpea, alfalfa and clover, are the most agronomically
important members of this small group of nitrogen-fixing plants.

The rhizobia-legume interaction is generally host-strain specific, whereby
successful symbiotic associations only occur between specific rhizobial
strains and a limited number of legume species. The specificity of this
interaction is determined by chemical signalling between plant and bacteria,
which accompanies the initial interaction and the establishment of the
symbiotic association (Hirsch et al. 2001, Plant Physiol. 127: 1484-1492).
Specific (iso)flavonoids, secreted into the soil by legume spp, allow
Rhizobium spp to distinguish compatible hosts in their proximity and to
migrate and associate with roots of the host. In a compatible interaction, the
(iso)flavonoid perceived by the Rhizobium spp. interacts with the rhizobial
nodD gene product, which in turn leads to the induction of rhizobial
Nod-factor synthesis. Nod-factor molecules are lipo-chitin-oligosaccharides,
commonly comprising four or five β-1,4 linked N-acetylglucosamines, with a
16 to 18 carbon chain fatty acid n-acetylated on the terminal non-reducing
sugar. Nod factors are synthesised in a number of variants, characterised by
their chemically different substitutions on the chitin backbone which are
distinguished by the compatible host plant. The perception of Nod-factors by
the host induces invasion zone root hairs, in the proximity of rhizobial cells, to
curl and entrap the bacteria. The adjacent region of the root hair plasma
membrane invaginates and new cell wall material is synthesized to form an
infection thread or tube, which serves to transport the symbiotic bacteria
through the epidermis to the cortical cells of the root. Here the cortical cells
are induced to divide to form a primordium, from which a root nodule
subsequently develops. In legumes belonging to genera like Arachis
(peanut), Stylosanthes and Sesbania, infection is initiated by a simple "crack
entry" through spaces or cavities between epidermal cells and lateral roots.
In spite of these differences, perception of Nod factors by the host plant
simultaneously induces the expression of a series of plant nodulin genes,
which control the development and function of root nodules, wherein the
rhizobial endosymbiotic association and nitrogen fixation are localised.

A variety of molecular approaches have identified a series of plant nodulin
genes which play a role in rhizobial-legume symbiosis, and whose
expression is induced at early or later stages of rhizobial infection and nodule
development (Geurts and Bisseling, 2002, Plant Cell supplement S239-249).
Furthermore, plant mutant studies have revealed that a signalling pathway
must be involved in amplifying and transducing the signal resulting from
Nod-factor perception, which is required for the induction of nodulin gene
expression. Among the first physiological events identified in this signal
transduction pathway, which occurs circa 1 min after Nod-factor application
to the root epidermis, is a rapid calcium influx followed by chloride efflux,
causing depolarisation of the plasma membrane and alkalization of the
external root hair space of the invasion zone. A subsequent efflux of
potassium ions allows re-polarisation of the membrane, and later a series of
calcium oscillations are seen to propagate the signal through the root hair
cell. Pharmacological studies with specific drugs, which mimic or block
Nod-factor induced responses, have identified potential components of the
signalling pathway. Thus mastoparan, a peptide which is thought to mimic
the activated intracellular domain of G-protein coupled receptors, can induce
early Nod gene expression and root hair curling. This suggests that trimeric
G protein may be involved in the Nod-factor signal transduction pathway.
Analysis of a group of nodulation mutants, including some that fail to show
calcium oscillations in response to Nod-factor signals, has revealed that in
addition to the lack of nodulation, these mutants are unable to form
endosymbioses with arbuscular mycorrhizal fungi. This implies that a
common symbiotic signal transduction pathway is shared by two types of
endosymbiotic relationships, namely root nodule symbiosis, which is largely
restricted to the legume family, and arbuscular mycorrhizal symbiosis, which
is common to the majority of land plant species. This suggests that there may
be a few key genes which dispose legumes to engage in nodulation, and
which are missing from crop plants such as cereals.

The identification of these key genes, which encode functions which are
indispensable for establishing a nitrogen fixing system in legumes, and their
transfer and expression in non-nodulating plants, has long been a goal of
molecular plant breeders. This could have a significant agronomic impact on
the cultivation of cereals such as rice, where production of two harvests a
year may require fertilisation with up to 400 kg nitrogen per hectare. In
accordance with this goal, WO02102841 describes the gene encoding the
NORK polypeptide, isolated from the nodulating legume Medicago sativa,
and the transformation of this gene into plants incapable of nitrogen fixation.
The NORK polypeptide and its homologue/orthologue SYMRK from Lotus
japonicus (Stracke et al. 2002, Nature 417: 959-962) are transmembrane
receptor-like kinases with an extracellular domain comprising leucine-rich
repeats, and an intracellular protein kinase domain. Lotus japonicus mutants,
with a nonfunctional SYMRK gene, fail to form symbiotic relationships with
either nodulating rhizobia or arbuscular mycorrhiza. This implies that a
common symbiotic signalling pathway mediates these two symbiotic
relationships, where SYMRK comprises an early step in the pathway. The
symRK mutants retain an initial response to rhizobial infection, whereby the
root hairs in the susceptible invasion zone undergo swelling of the root hair
tip and branching, but fail to curl. This suggests that the SYMRK protein is
required for an early step in the common symbiotic signalling pathway,
located downstream of the perception and binding of microbial signal
molecules (e.g. Nod-factors), that leads to the activation of nodulin gene
expression.

The search for key symbiosis genes has also focussed on 'candidate genes'
encoding receptor proteins with the potential for perceiving and binding
Nod-factors or surface structures on rhizobial bacteria. US 6,465,716 discloses
NBP46, a Nod-factor binding lectin isolated from Dolichos biflorus roots, and
its transgenic expression in transformed plants. Transgenic expression of
NBP46 in plants is reported to confer the ability to bind to specific
carbohydrates in the rhizobial cell wall and thereby to bind these bacteria and
utilise atmospheric nitrogen, as well as conferring apyrase activity. An
alternative approach to search for key symbiosis genes has been to screen
for Nod-factor binding proteins in protein extracts of plant roots. NFBS1 and
NFBS2 were isolated from Medicago truncatula and shown to bind
Nod-factors in nanomolar concentrations; however, they both failed to exhibit
the Nod-factor specificity characteristic of rhizobia-legume interactions (Geurts
and Bisseling, 2002 supra).

The Nod-factor binding element, which is responsible for strain specific
Nod-factor perception, is not, as yet, identified. The isolation and characterisation
of this element and its respective gene(s) would open the way to introducing
Nod-factor recognition into non-nodulating plants and thereby the potential to
establish Rhizobium-based nitrogen fixation in important crop plants.



Rhizobial strains produce strain-specific Nod-factors, lipochitin
oligosaccharides (LCOs), which are required for a host-specific interaction
with their respective legume hosts. Lotus and peas belong to two different
cross-inoculation groups, where Lotus develops nodules after infection with
Mesorhizobium loti, while pea develops nodules with Rhizobium
leguminosarum bv viceae. Cultivars belonging to a given Lotus sp also vary
in their ability to interact and form nodules with a given rhizobial strain.
Perception of Nod-factor secreted by Rhizobium spp bacteria, as the first
step in nodulation, commonly leads to the initiation of tens or even hundreds
of rhizobial infection sites in a root. However, the majority of these infections
abort and only in a few cases do the rhizobia infect the nodule primordium.
The frequency and efficiency of the Rhizobium-legume interaction leading to
infection is known to be influenced by variations in Nod-factor structure. The
genetics of Nod-factor synthesis and modification of their chemical structure
in Rhizobium spp have been extensively characterised. An understanding of
Nod-factor binding and perception, and the structure of its component
elements, is needed in order to optimise the host Nod-factor response. This
information would, in turn, provide the necessary tools to breed for enhanced
efficiency of nodulation and nitrogen fixation in current nitrogen-fixing crops.
The importance of this goal is clearly illustrated by the performance of the
major US legume crop, soybean, which is grown on 15%, or more, of
agricultural land in the US. While nitrogen fixation by soybean root nodules
can assimilate as much as 100 kg nitrogen per hectare per year, these high
levels of nitrogen assimilation are insufficient to support the growth of the
highest yielding modern soybean cultivars, which still require the application
of fertiliser.

In summary, there is a need to increase the efficiency of nodulation and
nitrogen fixation in current legume crops as well as to transfer this ability to
non-nodulating crops in order to meet the nutritional needs of a growing
global population, while minimising the future use of nitrogen fertilisers and
their associated negative environmental impact.

Summary of the invention

The invention provides a Nod-factor binding element comprising one or more
isolated NFR polypeptides. The NFR polypeptides of the invention are NFR1,
comprising an amino acid sequence substantially identical to SEQ ID No: 25,
having specific Nod-factor binding properties, and NFR5 comprising an
amino acid sequence substantially identical to SEQ ID No: 8, having specific
Nod-factor binding properties. Furthermore, the invention provides for the
isolation of nucleic acid molecules comprising NFR1 and NFR5 gene and
cDNA sequences encoding said NFR1 polypeptide and said NFR5
polypeptide, comprising a nucleic acid sequence substantially identical to
SEQ ID No: 23 and SEQ ID No: 7, respectively.

According to a further embodiment of the invention, a method is provided for
producing a plant expressing the Nod-factor binding element, the method
comprising introducing into the plant a transgenic expression cassette
comprising a nucleic acid sequence, encoding a NFR polypeptide having
specific Nod-factor binding properties and having an amino acid sequence
substantially identical to SEQ ID No: 25 or SEQ ID No: 8, respectively,
wherein the nucleic acid sequence is operably linked to its own promoter or a
heterologous promoter, preferably a root specific promoter. In a preferred
embodiment, the expression of both said NFR1 and NFR5 polypeptides by
the transgenic plant confers on the plant the ability to bind Nod-factors in a
chemically specific manner and thereby initiate the establishment of a
Rhizobium-plant interaction leading to the development of nitrogen-fixing root
nodules.

According to a further embodiment, the invention provides a method for
marker assisted breeding of NFR alleles, encoding variant NFR polypeptides,
comprising the steps of identifying variant NFR1 or NFR5 polypeptides in a
nodulating legume species, comprising an amino acid sequence substantially
similar to SEQ ID No: 25 or SEQ ID No: 8 respectively; determining the
nodulation frequency of plants expressing said variant NFR1 or NFR5
polypeptide; identifying DNA polymorphisms at loci genetically linked to or
within the allele locus encoding said variant NFR1 or NFR5 polypeptide; preparing
molecular markers based on said DNA polymorphisms; and using said
molecular markers for the identification and selection of plants carrying NFR
alleles encoding said variant NFR1 or NFR5 polypeptides. The invention
includes plants selected by the use of this method of marker assisted
breeding. In a preferred embodiment, said method of marker assisted
breeding of NFR alleles provides for the breeding of legumes with enhanced
nodulation frequency and nodule occupancy.



Brief Description of the figures 

Figure 1: Map based cloning of Lotus NFR5. a. Genetic map of the NFR5
region with positions of linked AFLP and microsatellite markers above the
line and distances in cM below. The fraction of recombinant plants detected
in the mapping population is indicated. b. Physical map of the BAC and TAC
clones between the closest linked microsatellite markers. The positions of
sequence-derived markers used to fine-map the NFR5 locus, and the fraction
of recombinant plants found in the mapping population are indicated. c.
Candidate genes identified in the sequenced region delimited by the closest
linked recombination events. d. Structure of the NFR5 gene, position of the
transcription initiation point and the nfr5-1, nfr5-2 and nfr5-3 mutations. The
asterisk indicates the position of a stop codon in nfr5-3; the black triangle a
retrotransposon insertion in nfr5-2; and the grey box defines the deletion in
nfr5-1. GGDP: geranylgeranyl diphosphate synthase; RE: retroelement; RZF:
ring zinc finger protein; GT: glycosyl transferase; A2L: apetala2-like protein;
RLK: receptor-like kinase; PL: pectate lyase-like protein; AS: ATPase-subunit;
HD: homeodomain protein; RF: ring finger protein. Hypothetical
proteins are not labelled. e. Southern hybridization demonstrating deletion of
SYM10 in the "N15" sym10 mutant line. EcoRI digested genomic DNA of the
parental variety "Sparkle" and the fast neutron derived mutant "N15"
hybridized with a pea SYM10 probe covering the region encoding the
predicted extracellular domain. Hybridization with a probe from the
3' untranslated region demonstrated that the complete gene was deleted.

Figure 2: Structure and domains of the NFR5 protein. a. Schematic
representation of the NFR5 protein domains. b. The amino acid sequence of
NFR5 arranged in protein domains. Bold: conserved LysM residues. Bold
and underlined: residues conserved in protein kinase domains (KD); TM:
transmembrane, SP: signal peptide. The asterisk indicates a stop codon in
nfr5-3; the black triangle a retrotransposon insertion in nfr5-2 and the
grey box defines the amino acids deleted in nfr5-1. c. Individual alignment of
the three LysM motifs (M1, M2, M3) from NFR5, pea SYM10, Medicago
truncatula (M.t, Ac126779), rice (Ac103891), the single LysM in chitinase from
Volvox carteri (Acc. No: T08150) and the pfam consensus. d. The divergent
or absent activation loop (domain VIII) in the NFR5 family of receptor kinases
is illustrated by alignment of kinase motifs VII, VIII and IX from Arabidopsis
(At2g33580), NFR5, SYM10, Medicago truncatula (M.t, Ac126779), rice
(AC103891) and the SMART consensus. Conserved domain VII aspartic acid
is marked in bold and underlined. In c and d the amino acids conserved in all
aligned motifs are marked in black and amino acids conserved in two or more
motifs are marked in grey.

Figure 3. The aligned amino acid sequence of the LjNFR5 and PsSYM10
proteins. Amino acid residues sharing identity are highlighted. The Medicago
truncatula (Ac126779) sequence showing 76 % amino acid identity to Lotus NFR5 is
included to exemplify a substantially identical protein sequence.

Figure 4. Steady-state levels of LjNFR5 and PsSYM10 mRNA. a. NFR5
mRNA detected in uninoculated roots, inoculated roots, nodules, leaves,
flowers and pods of Lotus plants. b. Time course of NFR5 mRNA transcript
accumulation in roots after inoculation with M. loti. The identity of the
amplified transcripts was confirmed by sequencing. ATPase was used as
internal control and relative normalised values compared to uninoculated
roots are shown. c. Northern analysis showing NFR5 mRNA expression in
nodule, leaf and root of symbiotically and non-symbiotically grown Lotus
plants. d. Northern analysis showing Sym10 mRNA expression in leaf, root
and nodule of symbiotically and non-symbiotically grown pea plants.

Figure 5. Positional cloning of the NFR1 gene. a. Genetic map of the region
surrounding the NFR1 locus. Positions of the closest AFLP, microsatellite-
and PCR-markers are given together with genetic distances in cM. b.
Physical map of the NFR1 locus. BAC clones 56L2, 16K18, 10M24, 36D15,
56K22 and TAC clones LjT05B16, LjT02D13, LjT211O02, which cover the
region are shown. The numbers of recombination events detected with BAC
and TAC end-markers or internal markers are given. Arrows indicate the
positions of the two markers (10M24-2, 56L2-2) delimiting the NFR1 locus.
UFD and HP correspond to the UFD1-like protein and the hypothetical
protein encoded in the region. c. Exon-intron structure of the NFR1 gene.
Boxes correspond to exons, where LysM motifs are shown in light grey,
trans-membrane region in black, kinase domains in dark grey. Dotted lines
define introns and full lines define the 5' and 3' un-translated regions. The
nucleotide lengths of all exons and introns are indicated. The numbers
between brackets correspond to exon and intron 4, corresponding to
alternative splicing.

Figure 6. Structure and domains of the NFR1 protein. a. Primary structure of
the NFR1 protein comprising a signal peptide (SP); LysM motifs (LysM1 and
LysM2); transmembrane region (TM); protein kinase domains with conserved
amino acids in bold and underlined (PK). The cysteine couples (CxC) are in
bold and the LysM amino acids important for secondary structure
maintenance are underlined. The two extra amino acids resulting from
alternative splicing are shown in brackets. I-XI represent the kinase domains.
Asterisks indicate positions of the nonsense mutations found in NFR1-1 and
NFR1-2 mutant alleles. b. Alignments of the two NFR1 LysM motifs to the
consensus sequences predicted by the SMART program and the
Arabidopsis thaliana (Acc No: NP566689), rice (O. sativa) (Acc No:
BAB89226), and Volvox carteri (Acc. No: T08150) LysM motifs.

Figure 7. NFR1, NFR5 and SymRK gene expression. a. Transcript levels of
NFR1 in uninoculated roots, inoculated roots, nodules, leaves, flowers and pods of
wild type plants. b. Transcript levels of NFR1 in wild type, nfr1, nfr5 and
symRK mutant plants after inoculation with M. loti. c. Transcript levels of
NFR5 in wild type, nfr1, nfr5 and symRK mutant plants after inoculation with
M. loti. d. Transcript levels of SYMRK in wild type, nfr1, nfr5 and symRK
mutant plants after inoculation with M. loti. Transcript levels were measured
by quantitative PCR. ATPase was used as internal control and relative values
normalised to the untreated roots (zero hours) are shown.

Figure 8. Root hair response after inoculation with M. loti or Nod-factor
application. a. Wild type root hair curling on seedlings inoculated with M. loti.
b. Root hair deformations on wild type seedlings after Nod-factor application.
c. Root hairs on nfr1-1 seedlings inoculated with M. loti. d. Root hairs on
nfr1-1 seedlings after Nod-factor application. e. Root hairs with balloon
deformations on symRK-3 mutants inoculated with M. loti. f. Root hairs on a
nfr1-1,symRK-3 double mutant inoculated with M. loti. g. Excessive root hair
response on nin mutants inoculated with M. loti. h. Root hairs on a nfr1-1,nin
double mutant inoculated with M. loti. Root hairs on nfr5-1 seedlings
inoculated with M. loti, nfr5-1 seedlings after Nod-factor application,
untreated nfr5-1 control, untreated wild type control and untreated nfr1-1 control
are indistinguishable from the straight root hairs shown in c, d, f, h and are
therefore not shown. Inserts to the right of a to h show a close-up of the root
hairs.

Figure 9. Membrane depolarisation and pH changes in the extracellular root
hair space after application of Nod-factor purified from M. loti. Influence of 0.1
µM Nod-factor (NF) on membrane potential (Em) and/or external pH (pH) of
a. Lotus wild type b. nfr5-1 and nfr5-2 mutants c. nfr1-1 and nfr1-2 mutants
d. symRK-1 and symRK-3 mutants e. nfr1-2,symRK-3 double mutant. f. pH
changes in the extracellular root hair space after application of an
undecorated chito-octaose.

Figure 10. Expression of the NIN and ENOD2 genes in wild type, nfr1 and
nfr5 mutant genotypes. a. NIN transcript level in RNA extracted from roots
two hours to 12 days after M. loti inoculation. b. ENOD2 transcript level in
RNA extracted from roots two hours to 12 days after M. loti inoculation.
Transcript levels were measured by quantitative PCR and the identity of the
amplified sequences was confirmed by sequencing. ATPase was used as
internal control and relative values normalised to the untreated root (zero
hours) are shown.

Figure 11. Alignment of NFR1 and NFR5 proteins reveals an overall similarity of
33 % amino acid identities.

Figure 12. Domain structure of native and hybrid NFR1 and NFR5
polypeptides.



Detailed description of the invention

I. Definitions

AFLP: Amplified Fragment Length Polymorphism is a PCR-based technique
for the amplification of genomic fragments obtained after digestion with two
different enzymes. Different genotypes can be differentiated based on the
size of amplified fragments or by the presence or absence of a specific
fragment (Vos, P. (1998), Methods Mol. Biol., 82:147-155). The technique is
used to map genetic loci.

Agrobacterium rhizogenes-mediated transformation: is a technique used
to obtain transformed roots by infection with Agrobacterium rhizogenes.
During the transformation process the bacterium transfers a DNA fragment
(T-DNA) from an endogenous plasmid into the plant genome (Stougaard, J. et
al. (1987) Mol.Gen.Genet. 207, 251-255). For transfer of a gene of interest
the gene is first inserted into the T-DNA region of Agrobacterium rhizogenes
which is subsequently used for wound-site infection.

Allele: gene variant

BAC clones: clones from a Bacterial Artificial Chromosome library 

Conservatively modified variant: when referring to a polypeptide sequence
when compared to a second sequence, includes individual conservative
amino acid substitutions as well as individual deletions, or additions of amino
acids. Conservative amino acid substitution tables, providing functionally
similar amino acids, are well known in the art. When referring to nucleic acid
sequences, conservatively modified variants are those that encode an identical
amino acid sequence, in recognition of the fact that codon redundancy allows
a large number of different sequences to encode any given protein.
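
By way of illustration only, the consequence of codon redundancy for nucleic
acid sequences can be sketched in Python. The sketch assumes the standard
genetic code; the function names translate, encodes_same_protein and
count_encodings are hypothetical and form no part of the invention.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def encodes_same_protein(dna_a: str, dna_b: str) -> bool:
    """True if two nucleic acid sequences encode an identical amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


def count_encodings(protein: str) -> int:
    """Number of distinct DNA sequences that encode the given protein."""
    degeneracy = Counter(CODON_TABLE.values())  # codons available per residue
    total = 1
    for residue in protein.upper():
        total *= degeneracy[residue]
    return total
```

For example, ATGGCT and ATGGCC differ in one nucleotide but both translate to
Met-Ala, so each is a conservatively modified variant of the other in the
sense defined above.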

Contig: a series of overlapping cloned sequences e.g. BACs, co-linear and
homologous to a region of genomic DNA.

Exons: protein coding sequences of a gene sequence

Expression cassette: refers to a nucleic acid sequence, comprising a
promoter operably linked to a second nucleic acid sequence containing an
ORF or gene, which in turn is operably linked to a terminator sequence.

Heterologous: A polynucleotide sequence is "heterologous to" an organism
or a second polynucleotide sequence if it originates from a foreign species, or
from a different gene, or is modified from its original form. A heterologous
promoter operably linked to a coding sequence refers to a promoter from a
species, different from that from which the coding sequence was derived, or,
from a gene, different from that from which the coding sequence was derived.

Homologue: is a gene or protein with substantial identity to another gene's
sequence or another protein's sequence.

Identity: refers to two nucleic acid or polypeptide sequences that are the
same or have a specified percentage of nucleic acids or amino acids that are
the same, when compared and aligned for maximum correspondence over a
comparison window, as measured using one of the sequence comparison
algorithms listed herein, or by manual alignment and visual inspection. When
percentage of sequence identity is used in reference to proteins, it is
recognized that residue positions that are not identical often differ by
conservative amino acid substitutions, where amino acid residues are
substituted for amino acid residues with similar chemical properties (e.g.
charge or hydrophobicity) and therefore do not change the functional
properties of the molecule. Where sequences differ in conservative
substitutions, the percent sequence identity may be adjusted upwards to
account for the conservative nature of the substitution. Typically this involves
scoring a conservative substitution as a partial rather than a full mismatch,
thus increasing the percent identity. Means for making these adjustments are
well known to those skilled in the art.
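
By way of illustration only, the upward adjustment described in this
definition may be sketched as follows for two sequences that have already
been aligned (gaps written as "-"). The residue groupings treated as
conservative and the partial score of 0.5 are assumptions chosen for the
example; they are not values specified herein, and the function names are
hypothetical.

```python
# Assumed groupings of residues with similar chemical properties (illustrative).
CONSERVATIVE_GROUPS = [
    set("KRH"),    # basic
    set("DE"),     # acidic
    set("ST"),     # small hydroxyl
    set("NQ"),     # amide
    set("AVLIM"),  # aliphatic / hydrophobic
    set("FYW"),    # aromatic
]


def is_conservative(x: str, y: str) -> bool:
    """True if two different residues fall within one assumed group."""
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)


def percent_identity(seq_a: str, seq_b: str, partial_credit: float = 0.0) -> float:
    """Percent identity over aligned columns.

    With partial_credit > 0, a conservative substitution is scored as a
    partial rather than a full mismatch, which raises the percentage.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    score = 0.0
    columns = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" and y == "-":
            continue              # column empty in both sequences
        columns += 1
        if x == "-" or y == "-":
            continue              # a gap counts as a full mismatch
        if x == y:
            score += 1.0
        elif is_conservative(x, y):
            score += partial_credit
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * score / columns
```

For the aligned pair MKTLLV and MRTLIV, four of six positions are identical;
scoring the K/R and L/I substitutions at 0.5 each raises the figure from
four sixths to five sixths of the aligned positions.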

Introns: are non-coding sequences interrupting protein coding sequences
within a gene sequence.

LCO: lipochitin oligosaccharides.

Legumes: are members of the plant Family Fabaceae, and include bean,
pea, soybean, clover, vetch, alfalfa, peanut, pigeon pea, chickpea, fababean,
cowpea and lentil; in total approximately 20,000 species.

Locus: or "loci" refers to the map position of a nucleic acid sequence or gene
on a genome.

Marker assisted breeding: the use of DNA polymorphisms as "molecular
markers" (for example simple sequence repeats (microsatellites) or single
nucleotide polymorphisms (SNPs)), which are found at loci genetically linked to,
or within, the NFR1 or NFR5 loci, to breed for advantageous NFR alleles.

Molecular markers: refer to sites of variation at the DNA sequence level in a
genome, which commonly do not show themselves in the phenotype, and
may be a single nucleotide difference in a gene, or a piece of repetitive DNA.

Monocotyledonous cereal: includes, but is not limited to, barley, maize,
oats, rice, rye, sorghum, and wheat.

Mutant: a plant or organism with a modified genome sequence resulting in a
phenotype which differs from the common wild-type phenotype.

Native: as in "native promoter" refers to a promoter operably linked to its
homologous coding sequence.

NFR: refers to NFR genes, in particular NFR1 and NFR5 genes which
encode NFR1 and NFR5 polypeptides respectively, and comprise a nucleic
acid sequence substantially identical to SEQ ID No: 23 and SEQ ID No: 7,
respectively.

NFR polypeptides: are polypeptides that are required for Nod-factor binding
and function as the Nod-factor binding element in nodulating plants. NFR
polypeptides include the NFR5 polypeptide, having an amino acid sequence
substantially similar to SEQ ID No: 8, and the NFR1 polypeptide, having an
amino acid sequence substantially similar to SEQ ID No: 25. NFR5 and
NFR1 polypeptides show little sequence homology, but they share a similar
domain structure comprising an N-terminal signal peptide, an extracellular
domain having 2 or 3 LysM-type motifs, followed by a transmembrane
domain, followed by an intracellular domain comprising a kinase domain
characteristic of serine/threonine kinases. The extracellular domain of NFR
proteins is the primary determinant of the specificity of Nod-factor
recognition, whereby a host plant comprising a given NFR allele will only
form nodules with one or a limited number of Rhizobium strains.

Northern blot analysis: a technique for the quantitative analysis of mRNA
species in an RNA preparation.

Nod-factors: are synthesised by nitrogen-fixing Rhizobium bacteria, which
form symbiotic relationships with specific host plants. They are
lipo-chitin-oligosaccharides (LCOs), commonly comprising four or five β-1,4
linked N-acetylglucosamines, with a 16 to 18 carbon chain fatty acid
n-acetylated on the terminal non-reducing sugar. Nod-factors are synthesised
in a number of chemically modified forms, which are distinguished by the
compatible host plant.

Nod-factor binding element: comprises one or more NFR polypeptides
present in the roots of nodulating plants, and functions in detecting the
presence of Nod-factors at the root surface and within the root and nodule
tissues. The NFR polypeptides, which are essential for Nod-factor detection,
comprise the first step in the Nod-factor signalling pathway that triggers the
development of an infection thread and root nodules.

Nod-factor binding properties: are a characteristic of NFR1 and NFR5
polypeptides and are particularly associated with the extracellular domain of
said NFR polypeptides, which comprise LysM domains. The binding of
Nod-factors by the extracellular domain of NFR polypeptides is specific, such that
the NFR polypeptides can distinguish between the strain-specific chemically
modified forms of Nod-factor.

Nodulating plant: a plant capable of establishing an endosymbiotic
Rhizobium-plant interaction with a nitrogen-fixing Rhizobium bacterium,
including the formation of an infection thread, and the development of root
nodules capable of fixing nitrogen. Nodulating plants are limited to a few
plant families, and are particularly found in the Legume family, and they are
all members of the Rosid I clade.

Non-nodulating plant: a plant which is incapable of establishing an
endosymbiotic Rhizobial-plant interaction with a nitrogen-fixing Rhizobial
bacterium, and which does not form root nodules capable of fixing nitrogen.

Operably linked: refers to a functional linkage between a promoter and a
second sequence, wherein the promoter sequence initiates transcription of
RNA corresponding to the second sequence.

ORF: Open Reading Frame, which defines one of three putative protein
coding sequences in a DNA polynucleotide.

Orthologue: two homologous genes (or proteins) that diverged concurrently
with the divergence of the organisms harbouring them. Orthologues commonly
serve the same function within the organisms and are most often present in a
similar position on the genome.

PCR: Polymerase Chain Reaction is a technique for the amplification of DNA
polynucleotides, employing a heat-stable DNA polymerase and short
oligonucleotide primers, which hybridise to the DNA polynucleotide template
in a sequence-specific manner and provide the primer for 5' to 3' DNA
synthesis. Sequential heating and cooling cycles allow denaturation of the
double-stranded DNA template and sequence-specific annealing of the
primers, prior to each round of DNA synthesis. PCR is used to amplify DNA
polynucleotides employing the following standard protocol or modifications
thereof:

PCR amplification is performed in 25 µl reactions containing: 10 mM
Tris-HCl, pH 8.3 at 25°C; 50 mM KCl; 1.5 mM MgCl2; 0.01% gelatin; 0.5 unit
Taq polymerase and 2.5 pmol of each primer together with template genomic
DNA (50-100 ng) or cDNA. PCR cycling conditions comprise heating to 94°C
for 45 seconds, followed by 35 cycles of 94°C for 20 seconds; annealing at
X°C for 20 seconds (where X is a temperature between 40 and 70°C defined
by the primer annealing temperature); 72°C for 30 seconds to several
minutes (depending on the expected length of the amplification product). The
last cycle is followed by heating to 72°C for 2-3 minutes, and terminated by
incubation at 4°C.
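
By way of illustration only, the cycling conditions of the standard protocol can be written out as an explicit list of steps. The following Python sketch is not part of the protocol itself: the names `Step`, `cycling_program` and `total_seconds` are hypothetical, and the 40 to 70°C annealing window, the 30 second extension and the 2 minute final extension at 72°C are assumed default readings of the conditions given above. The final hold at 4°C is omitted because it has no fixed duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def cycling_program(anneal_c, extension_s=30, final_extension_s=120, cycles=35):
    """Build the list of thermal steps for one PCR run (hypothetical helper)."""
    # The annealing temperature X is assumed to lie between 40 and 70 degrees C.
    if not 40 <= anneal_c <= 70:
        raise ValueError("annealing temperature must be between 40 and 70 C")
    steps = [Step("initial denaturation", 94, 45)]
    for _ in range(cycles):
        steps.append(Step("denaturation", 94, 20))
        steps.append(Step("annealing", anneal_c, 20))
        steps.append(Step("extension", 72, extension_s))
    steps.append(Step("final extension", 72, final_extension_s))
    return steps


def total_seconds(steps):
    """Sum the programmed hold times, ignoring ramping between temperatures."""
    return sum(step.seconds for step in steps)
```

For example, `cycling_program(55)` describes a run with annealing at 55°C; longer amplification products would be accommodated by raising `extension_s`.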

Pfam consensus: a consensus sequence derived from a large collection of
protein multiple sequence alignments and profile hidden Markov models used
to identify conserved protein domains (Bateman et al., 2002, Nucleic Acids
Res. 30: 276-80; searchable at http://www.sanger.ac.uk/Software/Pfam/
and on NCBI at http://www.ncbi.nlm.nih.gov/structure/cdd/wrpsb.cgi).

Protein domain prediction: sequences are analysed by BLAST
(www.ncbi.nlm.nih.gov/BLAST/) and PredictProtein
(www.embl-heidelberg.de/predictprotein/predictprotein.html). Signal peptides
are predicted by SignalP v. 1.1 (www.cbs.dtu.dk/services/SignalP/) and
transmembrane regions are predicted by TMHMM v. 2.0
(www.cbs.dtu.dk/services/TMHMM/).

Polymorphism: refers to "DNA polymorphism" due to nucleotide sequence
differences between aligned regions of two nucleic acid sequences.

Polynucleotide molecule: or "polynucleotide", or "polynucleotide sequence"
or "nucleic acid sequence" refers to deoxyribonucleotides or ribonucleotides
and polymers thereof in either single- or double-stranded form. The term
encompasses nucleic acids containing known analogs of natural nucleotides,
which have similar binding properties as the reference nucleic acid.

Promoter: is an array of nucleic acid control sequences that direct
transcription of an operably linked nucleic acid. As used herein, a "plant
promoter" is a promoter that functions in plants. Promoters include necessary
nucleic acid sequences near the start site of transcription, e.g. a TATA box
element, and optionally include distal enhancer or repressor elements,
which can be located several 1000 bp upstream of the transcription start site.
A tissue specific promoter is one which specifically regulates expression in a
particular cell type or tissue, e.g. roots. A "constitutive" promoter is one that
is active under most environmental and developmental conditions throughout
the plant.



RACE/5'RACE/3'RACE: Rapid Amplification of cDNA Ends is a PCR-based
technique for the amplification of 5' or 3' regions of selected cDNA
sequences which facilitates the generation of full-length cDNAs from mRNA.
The technique is performed using the following standard protocol or
modifications thereof: mRNA is reverse transcribed with RNase H- Reverse
Transcriptase essentially according to the protocol of Matz et al. (1999)
Nucleic Acids Research 27: 1558-60 and amplified by PCR essentially
according to the protocol of Kellogg et al. (1994) Biotechniques 16(6): 1134-7.

Real-time PCR: a PCR-based technique for the quantitative analysis of
mRNA species in an RNA preparation. The formation of amplified DNA
products during PCR cycling is monitored in real-time, using a specific
fluorescent DNA binding-dye and measuring fluorescence emission.

Sexual cross: refers to the pollination of one plant by another, leading to the
fusion of gametes and the production of seed.

SMART consensus: represents the consensus sequence of a particular
protein domain predicted by the Simple Modular Architecture Research Tool
database (Schultz, J. et al. (1998) PNAS 26:95(11):5857-64).

Southern hybridisation: filters carrying nucleic acids (DNA or RNA) are
prehybridized for 1-2 hours at 65°C with agitation in a buffer containing 7 %
SDS, 0.26 M Na2HPO4, 5 % dextran sulphate, 1 % BSA and 10 µg/ml
denatured salmon sperm DNA. Then the denatured, radioactively labelled
DNA probe is added to the buffer and hybridization is carried out overnight at
65°C with agitation. For low stringency, washing is carried out at 65°C with a
buffer containing 2xSSC, 0.1 % SDS for 20 minutes. For medium stringency,
washing is continued at 65°C with a buffer containing 1xSSC, 0.1 % SDS for
2x 20 minutes and for high stringency filters are washed a further 2x 20
minutes at 65°C in a buffer containing 0.3xSSC, 0.1 % SDS.
Probe labelling by random priming is performed essentially according to
Feinberg and Vogelstein (1983) Anal. Biochem. 132(1), 6-13
and Feinberg and Vogelstein (1984) Addendum. Anal. Biochem. 137(1),
266-267.

Substantially identical: refers to two nucleic acid or polypeptide sequences
that have at least 60%, preferably 80%, most preferably 90-95% nucleotide
or amino acid residue identity when aligned for maximum correspondence
over a comparison window as measured using one of the sequence
comparison algorithms given herein, or by manual alignment and visual
inspection. This definition also refers to the complement of the test sequence
with respect to its substantial identity to a reference sequence. A comparison
window refers to any one of the number of contiguous positions in a
sequence (being anything from between about 20 to about 600, most
commonly about 100 to about 150) which may be compared to a reference
sequence of the same number of contiguous positions after the two
sequences are optimally aligned. Optimal alignment can be achieved using
computerized implementations of alignment algorithms (e.g. GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics
Computer Group, 575 Science Dr., Madison, Wis. USA) or BLAST analyses
available on the site: (www.ncbi.nlm.nih.gov/).
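
As a hedged illustration of how the identity thresholds in this definition might be applied, the following Python sketch computes percent identity over two sequences that have already been optimally aligned (for instance by one of the programs named above) and assigns the result to a tier. The function names, the use of "-" as the gap character, and the convention of counting every aligned column that is not a double gap are assumptions of this sketch, not part of the definition.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions in the window")
    return 100.0 * matches / compared


def identity_tier(percent):
    """Map a percent identity onto the 60 / 80 / 90 percent thresholds."""
    if percent >= 90:
        return "most preferred"
    if percent >= 80:
        return "preferred"
    if percent >= 60:
        return "substantially identical"
    return "not substantially identical"
```

In this sketch a gap opposite a residue counts as a mismatch; other conventions exist and would give slightly different percentages.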

TAC clones: clones from a Transformation-competent Artificial Chromosome
library.

TM marker: is a microsatellite marker developed from a TAC sequence,
based on sequence differences between Lotus japonicus Gifu and MG-20
genotypes.

Transgene: refers to a polynucleotide sequence, for example a "transgenic
expression cassette", which is integrated into the genome of a plant by
means other than a sexual cross, commonly referred to as transformation, to
give a transgenic plant.

UTR: untranslated region of an mRNA or cDNA sequence. 

Variant: refers to "variant NFR1 or NFR5 polypeptides" encoded by different
NFR alleles.

Wild type: a plant gene, genotype, or phenotype predominating in the wild
population or in the germplasm used as standard laboratory stock.

II. Nod-factor binding

The present invention provides a Nod-factor binding element comprising one
or more isolated NFR polypeptides. The isolated NFR polypeptides, NFR1
and NFR5, as exemplified by SEQ ID No: 25 and SEQ ID No: 8, bind to
Nod-factors in a chemically-specific manner, distinguishing between the
different chemically modified forms of Nod-factors produced by different
Rhizobium strains. The chemical specificity of Nod-factor binding by NFR1
and NFR5 polypeptides is located in their extracellular domain, which
comprises LysM type motifs. The LysM protein motif, first identified in
bacterial lysin and muramidase enzymes degrading cell wall peptidoglycans,
is widespread among prokaryotes and eukaryotes (Ponting et al., 1999, J Mol
Biol. 289, 729-745; Bateman and Bycroft, 2000, J Mol Biol. 299, 1113-1119).
In bacteria it is often found in proteins associated with bacterial cell walls or
involved in pathogenesis, and in vivo and in vitro studies of Lactococcus
lactis autolysin demonstrate that the three LysM domains of this protein bind
peptidoglycan (Steen et al., 2003, J Biol. Chem. April issue). Since both A-
and B-type peptidoglycans, differing in amino acid composition as well as
cross-linking, were bound, it was concluded that autolysin LysM domains bind
the N-acetyl-glucosamine-N-acetyl-mureine backbone polymer. LysM domains
are frequently found together with amidase, protease or chitinase motifs, and
two confirmed chitinases carry LysM domains. One is the sex pheromone and
wound-induced polypeptide from the alga Volvox carteri that binds and
degrades chitin in vitro (Amon et al., 1998, Plant Cell 10, 781-9). The other is
α-toxin from Kluyveromyces lactis, that docks onto a yeast cell wall chitin
receptor (Butler et al. (1991) Eur J Biochem 199, 483-8). Structure-based
alignment of representative LysM domain sequences has shown a
pronounced variability among their primary sequences, except for the amino
acids directly involved in maintaining the secondary structure.

The NFR polypeptides are transmembrane proteins, able to transduce
signals perceived by the extracellular NFR domain across the membrane to
the intracellular NFR domain comprising kinase motifs, which serves to
couple signal perception to the common symbiotic signalling pathway leading
to nodule development and nitrogen fixation.

The methods employed for the practice and understanding of the invention,
which are described below, involve standard recombinant DNA techniques
that are well-known and commonly employed in the art and available from
Sambrook et al., 1989, Molecular Cloning: A laboratory manual.

III. Isolation of nucleic acid molecules comprising NFR genes and
cDNAs encoding NFR1 and NFR5 polypeptides and their orthologues.

The isolation of genes and cDNAs encoding NFR1 and NFR5 polypeptides,
comprising an amino acid sequence substantially similar to SEQ ID No: 25 or
SEQ ID No: 8 respectively, may be accomplished by a number of techniques.
For instance, a BLAST search of a genomic or cDNA sequence bank of a
desired legume plant species (e.g. soybean, pea or Medicago truncatula) can
identify test sequences similar to the NFR1 or NFR5 reference sequence,
based on the smallest sum probability score (P(N)). The (P(N)) score (the
probability of the match between the test and reference sequence occurring
by chance) for a "similar sequence" will be less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001.
This approach is exemplified by the Medicago truncatula sequence
(Ac126779) included in Figure 3. Oligonucleotide primers, together with
PCR, can be used to amplify regions of the test sequence from genomic DNA
or cDNA of the selected plant species, and a test sequence which is similar to
the full-length NFR1 and NFR5 gene sequences can be assembled. In the
case that an appropriate gene bank is not available for the selected plant
species, oligonucleotide primers, based on NFR1 and NFR5 gene
sequences, can be used to PCR amplify similar sequences from genomic DNA
or cDNA prepared from the selected plant. Alternatively, nucleic acid probes
based on NFR1 and NFR5 gene sequences can be hybridised to genomic or
cDNA libraries prepared from the selected plant species using standard
conditions, in order to identify clones comprising sequences similar to NFR1
or NFR5 genes. A nucleic acid sequence in a library, which hybridises to a
NFR1 or NFR5 gene-specific probe under conditions which include at least
one wash in 2xSSC at a temperature of at least about 65°C for 20 minutes, is
potentially a similar sequence to a NFR1 or NFR5 gene. A test sequence
comprising a full-length cDNA sequence similar to NFR1 or NFR5 cDNAs
having SEQ ID No: 22 and 22 or SEQ ID No: 6 respectively, can be
generated by 5' RACE cDNA synthesis, as described herein.

The nucleic acid sequence of each test sequence, derived from a selected
plant species, is determined in order to identify nucleic acid molecules which
are substantially identical to NFR1 or NFR5 genes having SEQ ID No: 23 or
SEQ ID No: 7, respectively, or nucleic acid molecules that encode proteins
whose amino acid sequence is substantially identical to NFR1 or NFR5,
having SEQ ID No: 25 or SEQ ID No: 8, respectively.

IV. Transgenic plants expressing NFR1 and/or NFR5 polypeptides

The polynucleotide molecules of the invention can be used to express a
Nod-factor binding element in non-nodulating plants and thereby confer the
ability to bind Nod-factors and establish a Rhizobium/plant interaction leading
to nodule development. An expression cassette comprising a nucleic acid
sequence encoding a NFR polypeptide, substantially identical to SEQ ID No:
25 or SEQ ID No: 8, and operably linked to its own promoter or a
heterologous promoter and 3' terminator can be transformed into a selected
host plant using a number of known methods for plant transformation. By way
of example, the expression cassette can be cloned between the T-DNA
borders of a binary vector, transferred into an Agrobacterium tumefaciens
host, and used to infect and transform a host plant. The expression cassette
is commonly integrated into the host plant in parallel with a selectable marker
gene giving resistance to a herbicide or antibiotic, in order to select
transformed plant tissue. Stable integration of the expression cassette into
the host plant genome is mediated by the virulence functions of the
Agrobacterium host. Binary vectors and Agrobacterium tumefaciens-based
methods for the stable integration of expression cassettes into all major
cereal plants are known, as described for example for rice (Hiei et al., 1994,
The Plant J. 6: 271-282) and maize (Yuji et al., 1996, Nature Biotechnology
14: 745-750). Alternative transformation methods, based on direct transfer,
can also be employed to stably integrate expression cassettes into the
genome of a host plant, as described by Miki et al., 1993, "Procedure for
introducing foreign DNA into plants", In: Methods in Plant Molecular Biology
and Biotechnology, Glick and Thompson, eds., CRC Press, Inc., Boca Raton,
pp 67-88. Promoters to be used in the expression cassette of the invention
include constitutive promoters, as for example the 35S CaMV promoter (Acc
V00141 and J02048) or in the case of a cereal host plant the Ubi1 gene
promoter (Christensen et al., 1992, Plant Mol Biol 18: 675-689). In a preferred
embodiment, a root specific promoter is used in the expression cassette, for
example the maize zmGRP3 promoter (Goodemeir et al., 1998, Plant Mol Biol
36, 799-802) or the epidermis expressed maize promoter described by Ponce
et al., 2000, Planta 211, 23-33. Terminators that may be used in the
expression construct can for instance be the NOS terminator (Acc
NC_003065).

Host plants transformed with an expression cassette encoding one NFR
polypeptide, for example NFR1, or its orthologue, can be crossed with a
second host plant transformed with an expression cassette encoding a
second NFR polypeptide, for example NFR5, or its orthologue. Progeny
expressing both said NFR polypeptides can then be selected and used in the
invention. Alternatively, host plants can be transformed with a vector
comprising two expression cassettes encoding both said NFR polypeptides.

V. NFR genes encoding NFR polypeptides having specific Nod-factor
binding properties.

Nucleic acid molecules comprising NFR1 or NFR5 genes encoding NFR
polypeptides having specific Nod-factor binding properties can be identified
by a number of functional assays described in the "Examples" given herein.
In a preferred embodiment, said nucleic acid sequences are expressed
transgenically in a host plant employing the expression cassettes described
above. Expression of NFR1 or NFR5 genes or their homologues/orthologues
in plant roots allows the specific Nod-factor binding properties of the
expressed NFR protein to be fully tested. Assays suitable for establishing
specific Nod-factor binding include the detection of: a morphological root hair
response (e.g. root hair deformation, root hair curling); a physiological
response (e.g. root hair membrane depolarisation, ion fluxes, pH changes
and calcium oscillations); a symbiotic signalling response (e.g. downstream
activation of symbiotic nodulin gene expression) following root infection with
Rhizobium bacteria or isolated Nod-factors; the ability to develop root nodule
primordia, infection pockets or root nodules, where the response is strain
dependent or dependent on the chemical modification of Nod-factor
structure.

VI. Marker assisted breeding for NFR alleles. 

A method for marker assisted breeding of NFR alleles, encoding variant NFR
polypeptides, is described herein, with examples from Lotus NFR alleles. In
summary, variant NFR1 or NFR5 polypeptides, comprising an amino acid
sequence substantially similar to SEQ ID No: 25 or SEQ ID No: 8
respectively, are identified in a nodulating legume species, and the
Rhizobium strain specificity of said variant NFR1 or NFR5 polypeptide is
determined, according to measurable morphological or physiological
parameters described herein. Subsequently, DNA polymorphisms at loci
genetically linked to, or within, the gene locus encoding said variant NFR1 or
NFR5 polypeptide, are identified on the basis of the nucleic acid sequence of
the loci or its neighbouring DNA region. Molecular markers based on said
DNA polymorphisms are used for the identification and selection of plants
carrying NFR alleles encoding said variant NFR1 or NFR5 polypeptides. Use
of this method provides a powerful tool for the breeding of legumes with
enhanced nodulation frequency.



III. Examples 
Example 1. 

Cloning of Nod-factor Binding Element Genes 

Genetic studies in the legume plants Lotus japonicus (Lj) and pea (Ps) have
generated collections of symbiotic mutants, which have been screened for
mutants blocked in the early steps of symbiosis (Geurts and Bisseling, 2002
supra; Kistner and Parniske 2002 Trends in Plant Science 7: 511-518).
Characteristic for a group of the selected mutants is their inability to respond
to Nod-factors, with the absence of root hair deformation and curling, cortical
cell division to form the cortical primordium, and induction of the early nodulin
genes which contribute to nodule development and function. Nod-factor
induced calcium oscillations were also found to be absent in some of these
mutants, indicating that they are blocked in an early step in Nod-factor
signalling. Among this latter group are a few mutants, including members of
the Pssym10 complementation group and LjNFR1 and LjNFR5 (previously
called Ljsym1 and 5), which failed to respond to Nod-factors but retain their
ability to establish mycorrhizal associations. Genetic mapping indicates that
the pea SYM10 and Lotus NFR5 loci could be orthologs.
Mutants falling within this group provided a useful starting point in the search
for genes encoding potential candidate proteins involved in Nod-factor
binding and perception.

A. Isolation, cloning and characterisation of NFR5 genes and gene
products.

1. Map based cloning of LjNFR5

The symbiotic mutants of Lotus japonicus nfr5-1, nfr5-2 and nfr5-3 (also
known as sym5), (previously isolated by Schauser et al., 1998, Mol Gen Genet
259: 414-423; Szczyglowski et al., 1998, Mol Plant-Microbe Interact 11:
684-697) were utilised. To determine the root nodulation phenotype under
symbiotic conditions, seeds were surface sterilised in 2% hypochlorite,
washed and inoculated with a two day old culture of M. loti NZP2235. Plants
were cultivated in the nitrogen-free B&D nutrients and scored after 6-7 weeks
(Broughton and Dilworth, Biochem J, 1971, 125, 1075-1080; Handberg and
Stougaard, Plant J. 1992, 2, 487-496). Under non-symbiotic conditions, plants
were cultivated in Hornum nutrients (Handberg and Stougaard, Plant J. 1992,
2, 487-496).

Mapping populations were established in order to localise the nfr5 locus on
the Lotus japonicus genome. Both intra- and interspecific F2 mapping
populations were created by crossing a Lotus japonicus "Gifu" nfr5-1 mutant
to wild type Lotus japonicus ecotype "MG20" and to wild type Lotus
filicaulis. MG-20 seeds are obtainable from Sachiko ISOBE, National
Agricultural Research Center for Hokkaido Region, Hitsujigaoka, Toyohira,
Sapporo Hokkaido 062-8555, JAPAN and L. filicaulis from Jens Stougaard,
Department of Molecular Biology, University of Aarhus, Gustav Wieds Vej 10,
DK-8000 Aarhus C. F2 plants homozygous for the nfr5-1 mutant allele were
identified after screening for the non-nodulation mutant phenotype. 240
homozygous F2 mutant plants were analysed in the L. filicaulis mapping
population and 368 homozygous F2 mutant plants in the "MG20" mapping
population.

Positional cloning of the nfr5 locus was performed by AFLP and Bulked
Segregant Analysis of the mapping populations using the EcoRI/MseI
restriction enzyme combination (Vos et al., 1995, Nucleic Acids Res. 23,
4407-4414; Sandal et al., 2002, Genetics 161, 1673-1683). Initially, nfr5 was
mapped to the lower arm of chromosome 2 between AFLP markers
E33M40-22F and E32M54-12F in the L. filicaulis based mapping population,
as shown in Figure 1a. The E32M54-12F marker was cloned and used to
isolate BAC clones BAC8H12 and BAC67I22 and TAC clone LjT18J10, as
shown in Figure 1b. The ends of this contig were used to isolate adjacent
BAC and TAC clones, namely BAC58K7 and LjT01G03 at one end and TAC
LjB06D23 on the other end. The outer end of LjB06D23 was used to isolate
TAC clone LjT13I23 (TM0522). Various markers from this contig were mapped
on the mapping populations from nfr5-1 crossed to L. filicaulis and to L.
japonicus MG-20. In the L. filicaulis mapping population one recombinant
plant was found with the outer end of the TAC clone TM0522, whereas no
recombinant plants were found with a marker from the middle of this TAC
clone. In the L. japonicus MG-20 mapping population, 4 recombinant plants
out of 368 plants were found with the marker TM0323, thereby delimiting nfr5
to a region of 150 kb. This region was sequenced and found to contain 13
ORFs, of which two encoded putative proteins sharing sequence homology
to receptor kinases. Sequencing of these two specific ORFs in genomic DNA
derived from nfr5-1 showed that one of the ORF sequences contained a 27
nucleotide deletion. Furthermore, sequencing of this ORF in genomic DNA
from nfr5-2 and nfr5-3 showed the insertion of a retrotransposon and a point
mutation leading to a premature stop codon, respectively, as shown in Figure
1d. The localisation of the nfr5 locus from physical and genetic mapping data,
combined with the identification of mutations in three independent nfr5
mutant alleles, provides unequivocal evidence that mutations in the NFR5
ORF lead to a loss of Nod-factor perception.
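
The recombinant counts reported above can be converted into an approximate genetic distance. The Python sketch below is an illustration under stated assumptions and is not a calculation taken from the mapping study: it assumes that each homozygous F2 mutant plant contributes two scored chromosomes, that each recombinant plant carries a single recombinant chromosome, and that for such small values the map distance in cM is approximately 100 times the recombination fraction. The function names are hypothetical.

```python
def recombination_fraction(recombinant_plants, plants, chromosomes_per_plant=2):
    """Fraction of scored chromosomes that are recombinant (assumed model)."""
    if plants <= 0:
        raise ValueError("the mapping population must contain at least one plant")
    return recombinant_plants / (plants * chromosomes_per_plant)


def map_distance_cm(fraction):
    """Small-distance approximation: 1 % recombination is taken as 1 cM."""
    return 100.0 * fraction


# MG-20 population: 4 recombinant plants among 368 homozygous F2 mutants.
mg20_distance = map_distance_cm(recombination_fraction(4, 368))
# L. filicaulis population: 1 recombinant plant among 240 homozygous F2 mutants.
filicaulis_distance = map_distance_cm(recombination_fraction(1, 240))
```

Under these assumptions both markers lie well under 1 cM from nfr5, which is in keeping with a physical interval of the order of 150 kb.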

2. Cloning the Lj NFR5 cDNA

A full-length cDNA corresponding to the NFR5 gene was isolated using a
combination of 5' and 3' RACE. RNA was extracted from Lotus japonicus
roots, grown in the absence of nitrate or rhizobia, and reverse transcribed to
make a full-length cDNA pool for the performance of 5'-RACE according to
the standard protocol. The cDNA was amplified using the 5' oligonucleotide
5'CTAATACGACTCACTATAGGGCAAGCAGTGGTAACAACGCAGAGT3'
(SEQ ID No: 1) and the reverse primer
5'GCTAGTTAAAAATGTAATAGTAACCACGC3' (SEQ ID No: 2), and a
RACE-product of approximately 2 kb was cloned into a topoisomerase
activated plasmid vector (Shuman, 1994, J Biol Chem 269: 32678-32684).
3'-RACE was performed on the same 5'-RACE cDNA pool, using a 5'
gene-specific primer 5' AAAGCAGCATTCATCTTCTGG 3' (SEQ ID No: 3) and an
oligo-dT primer 5' GACCACGCGTATCGATGTCGACTTTTTTTTTTTTTTTTV 3'
(SEQ ID No: 4), where the first 5 PCR cycles were carried out at an
annealing temperature of 42°C and the following 30 cycles at a higher
annealing temperature of 55°C. The products of this PCR reaction were used
as template for a second PCR reaction with a gene-specific primer positioned
further 3', having the sequence 5' GGAAGGGAAGGTAATTGAG 3' (SEQ ID
No: 5), and the above oligo-dT primer, using standard PCR amplification
conditions (annealing at 54°C; extension at 72°C for 30 s), and the products
were cloned into a topoisomerase activated plasmid vector (Shuman, 1994,
supra). Nucleotide sequencing of 18 5' RACE clones and three 3' RACE
clones allowed the full-length sequence of the NFR5 cDNA to be determined
(SEQ ID No: 6). The NFR5 cDNA was 2283 nucleotides in length, with an
open reading frame of 1785 nucleotides, preceded by a 5' UTR leader
sequence of 140 nucleotides and followed by a 3' UTR region of 358
nucleotides. Alignment of the NFR5 cDNA sequence with the NFR5 gene
sequence (SEQ ID No: 7), shown schematically in Figure 1d, confirmed that
the gene is devoid of introns.
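
The segment lengths reported for the NFR5 cDNA can be checked for internal consistency. The following Python sketch simply verifies that the 5' UTR, open reading frame and 3' UTR add up to the stated total and that the open reading frame is a whole number of codons; the function name `check_cdna_layout` is hypothetical and the check makes no statement about the sequence itself.

```python
def check_cdna_layout(total_nt, utr5_nt, orf_nt, utr3_nt):
    """Return simple consistency checks on reported cDNA segment lengths."""
    return {
        "lengths_sum_to_total": utr5_nt + orf_nt + utr3_nt == total_nt,
        "orf_is_whole_codons": orf_nt % 3 == 0,
    }


# Values as reported for the NFR5 cDNA: 140 + 1785 + 358 = 2283 nucleotides.
nfr5_checks = check_cdna_layout(total_nt=2283, utr5_nt=140, orf_nt=1785, utr3_nt=358)
```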

3. Primary sequence and structural domains of LjNFR5 and mutant
alleles.

The primary sequence and domain structure of NFR5, encoded by NFR5, are
consistent with a transmembrane Nod-factor binding protein, required for
Nod-factor perception in rhizobial-legume symbiosis. The NFR5 gene
encodes an NFR5 protein of 596 amino acids having the sequence given in
Figure 2b (SEQ ID No: 8) and a predicted molecular mass of 65.3 kD. The
protein domain structure predicted for NFR5, shown in Figure 2a, b,
defines a signal peptide, comprising a hydrophobic stretch of 26 amino acids,
followed by an extracellular domain with three LysM-type motifs, a
transmembrane domain and an intracellular kinase domain. The LysM-type
motifs found in Lotus NFR5, SYM10, Medicago truncatula (M.t., Ac126779),
and by homology in a rice gene (Ac103891), show homology to the single
LysM motif present in an algal (Volvox carteri) chitinase (Amon et al., 1998,
Plant Cell 10: 781-789) and to the Pfam consensus, as illustrated in the
amino acid sequence alignment of this domain given in Figure 2c. The NFR5
kinase domain has motifs characteristic of functional serine/threonine kinases
(Schenk and Snaar-Jagalska, 1999, Biochim Biophys Acta 1449: 1-24; Huse
and Kuriyan, 2002, Cell 109: 275-282), with the exception that motif VII lacks
an aspartic acid residue conserved in kinases, and motif VIII, comprising the
activation loop, is either divergent or absent.

Analysis of the nfr5 mutant genes reveals that the point mutation in nfr5-3
and the retrotransposon insertion in nfr5-2 will result in the expression of
truncated polypeptides of 54 amino acids, lacking the LysM motifs and entire
kinase domain; or of 233 amino acids, lacking the kinase motifs X and XI,
respectively. The 27 nucleotide deletion in the nfr5-1 mutant removes 9
amino acids from kinase motif V.



4. Cloning and characterisation of the pea SYM10 gene and cDNA and
sym10 mutants.

Wild type pea cv's (Alaska, Finale, Frisson, Sparkle) and the symbiotic
mutants (N15; P5; P56) were obtained from the pea germ-plasm collection at
JIC Norwich, UK, while the symbiotic mutant RisFixG was obtained from
Kjeld Engvild, Risø National Laboratory, 8000 Roskilde, Denmark. The
mutants, belonging to the pea sym10 complementation group, were identified
in the following genetic backgrounds: N15 type strain in a Sparkle
background (Kneen et al., 1994, J Heredity 65: 129-133), P5 in a Frisson
background (Duc and Messager, 1989, Plant Science 60: 207-213), RisFixG
in a Finale background (Engvild, 1987, Theoretical Applied Genetics
74: 711-713; Borisov et al., 2000, Czech Journal Genetics and Plant
Breeding 36: 106-110); P56 in a Frisson background (Sagan et al., 1994, Plant
Science 100: 59-70).

A fragment of the pea SYM10 gene was cloned by PCR amplification of cv
Finale genomic DNA using a standard PCR cycling program and the forward
primer 5'-ATGTCTGCCTTCTTTCTTCCTTC-3' (SEQ ID No: 9) and the
reverse primer 5'-CCACACATAAGTAATMAGATACT-3' (SEQ ID No: 10).
The sequence of these oligonucleotide primers was based on nucleotide
sequence stretches conserved in L. japonicus NFR5 and the partial
sequence of an NFR5 homologue identified in a M. truncatula root EST
collection (BE204912). The identity of the amplified 551 base pair SYM10
product was confirmed by sequencing, and the product was then used as a
probe to isolate a pea cv Alaska SYM10 genomic clone (SEQ ID No: 11) from
a cv. Alaska genomic library (obtained from H. Franssen, Department of
Molecular Biology, Agricultural University, 6703 HA Wageningen, The
Netherlands) and a full-length pea cv. Finale SYM10 cDNA clone (SEQ ID
No: 12) from a cv. Finale cDNA library (obtained from H. Franssen, supra),
which were then sequenced. The sequences of the SYM10 gene in cv.
Frisson (SEQ ID No: 13) and in cv. Sparkle (SEQ ID No: 14) were determined
by PCR amplification and sequencing of the amplified gene fragment. The
nucleotide sequences of the corresponding mutants P5, P56, and RisFixG
were also determined by PCR amplification and sequencing of the
amplified gene fragment.



Nucleotide sequence comparison of the SYM10 gene in the Pssym10 mutant
lines (P5, RisFixG and P56) with the wild type parent lines revealed, in each
case, sequence mutations which could be correlated with the mutant
phenotype. The 3 independent sym10 mutant lines identified 3 mutant
alleles of the SYM10 gene, all carrying nonsense mutations, and the N15
type strain was deleted for SYM10 (Table 1, Figure 4c). Southern
hybridization with probes covering either the extracellular domain of SYM10
or the 3'UTR on EcoRI digested DNA from N15 and the parent variety
Sparkle shows that the SYM10 gene is absent from the N15 mutant line.

5. Primary sequence and structural domains of PsSYM10 and mutant
alleles.

The PsSYM10 protein of pea, encoded by PsSYM10, is a homologue of the
NFR5 transmembrane Nod-factor binding protein of Lotus, required for
Nod-factor perception in rhizobial-legume symbiosis. The pea cv. Alaska SYM10
gene encodes a SYM10 protein (SEQ ID No: 15) of 594 amino acid residues,
with a predicted molecular mass of 66 kD, which shares 73% amino acid
identity with the NFR5 protein from Lotus. In common with the NFR5 protein,
the SYM10 protein has an N-terminal signal peptide, an extracellular region
with three LysM motifs, followed by a transmembrane domain, and then an
intracellular domain comprising kinase motifs (Figure 2 and 3).
The sym10 genes in the symbiotic pea mutants P5, RisFix6 and P56, each
having premature stop codons, encode truncated SYM10 proteins of 199,
387 and 404 amino acids, respectively, which lack part of, or the entire,
kinase domain (Table 1).

6. The NFR5 protein family is unique to nodulating plants

Comparative analysis defines LjNFR5 and PsSYM10 as members of a novel
family of transmembrane Nod-factor binding proteins. A BLAST search of
plant gene sequences suggests that genes encoding related, but presently
uncharacterised, proteins may be present in the legume Medicago truncatula
(AC126779), while more distantly related, predicted proteins may be found in
rice (AC103891) and Arabidopsis (At2g33580), with a sequence identity to
NFR5 of 61%, 39% and 28%, respectively. The high level of sequence
conservation in M. truncatula (AC126779) makes this protein and the gene
encoding the protein substantially identical to NFR5. In common with
NFR5 and SYM10, the kinase domains of these proteins also lack the
conserved aspartic acid residue of motif VII, and the activation loop in motif
VIII is highly diverged or absent, as shown in Figure 2d, with the exception of
the Arabidopsis protein. Only distantly related proteins are therefore found
outside the legume family. In conclusion, the NFR5 protein family appears to
be restricted to nodulating legumes, and its absence from other plant families
may be a key limiting factor in the establishment of rhizobial-root interactions
in the members of these families.

7. Tissue specific expression of the LjNFR5 and PsSYM10 genes

The expression pattern of the NFR5 and SYM10 genes in Lotus and pea is
consistent with the role of their gene products as transmembrane Nod-factor
binding proteins in the perception of rhizobial Nod-factors at the root surface,
and later during tissue invasion.

The expression of the NFR5 and SYM10 genes in various isolated organs of
Lotus and pea plants was investigated by determining the steady state
NFR5 and SYM10 mRNA levels using real-time PCR and/or Northern blot
analysis. Total RNA was isolated from root, leaf, flower, pod and nodule
tissues of uninoculated or inoculated Lotus "Gifu" or pea plants using a high
salt extraction buffer followed by purification through a CsCl cushion. For
Northern analysis, according to standard protocols, 20 µg total RNA was
size-fractionated on a 1.2% agarose gel, transferred to a Hybond membrane,
hybridised overnight with an NFR5 or SYM10 specific probe covering the
extracellular domain and washed at high stringency. Hybridization to the
constitutively expressed ubiquitin UBI gene was used as a control for RNA
loading and quality of the RNA.

For the quantitative real-time RT-PCR, total RNA was extracted using the
CsCl method and the mRNA was purified by biomagnetic affinity separation
(Jakobsen, K.S. et al. (1990) Nucleic Acids Research 18(12): 3669). The RNA
preparations were analysed for contaminating DNA by quantitative PCR and,
when necessary, the RNA was treated with DNaseI. The DNaseI enzyme
was then removed by phenol/chloroform extraction and the RNA was
precipitated and re-suspended in 20 µl RNase-free H2O. First strand cDNA
was prepared using Expand reverse transcriptase and the quantitative
real-time PCR was performed on a standard PCR LightCycler instrument. The
efficiency-corrected relative transcript concentration was determined and
normalized to a calibrator sample, using the Lotus japonicus ATP synthase
gene as a reference (Gerard C.J. et al. 2000 Mol. Diagnosis 5: 39-45).
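
The efficiency-corrected normalisation described above can be illustrated
with a minimal sketch. It assumes the commonly used efficiency-corrected
ratio model, in which each gene's amplification efficiency E is raised to the
difference in crossing point (Cp) between calibrator and sample; the text
does not state the exact formula used, and the function name, parameter
names and all numbers below are hypothetical.

```python
def relative_expression(e_target, cp_target_cal, cp_target_sample,
                        e_ref, cp_ref_cal, cp_ref_sample):
    """Efficiency-corrected target/reference ratio relative to a calibrator.

    Assumed model (not stated in the text):
        ratio = E_target ** (Cp_cal - Cp_sample) / E_ref ** (Cp_cal - Cp_sample)
    E = 2.0 corresponds to perfect doubling of product in every cycle.
    """
    target_change = e_target ** (cp_target_cal - cp_target_sample)
    reference_change = e_ref ** (cp_ref_cal - cp_ref_sample)
    return target_change / reference_change


# Hypothetical crossing points: a root sample compared with a leaf calibrator.
ratio = relative_expression(
    e_target=2.0, cp_target_cal=30.0, cp_target_sample=24.0,  # target gene
    e_ref=2.0, cp_ref_cal=20.0, cp_ref_sample=20.0,           # reference gene
)
print(f"relative transcript level: {ratio:.1f}")
```

A target that crosses threshold six cycles earlier in the sample than in the
calibrator, with an unchanged reference, corresponds to a 64-fold difference
under these assumptions.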

The level of NFR5 mRNA, determined by Northern blot analysis and
quantitative RT-PCR, was 60 to 120 fold higher in the root tissue of Lotus
plants in comparison to other plant tissues (leaves, stems, flowers, pods, and
nodules), as shown in Figure 4a. Northern hybridisation shows highest
expression of NFR5 in Lotus root tissue and a barely detectable expression
in nodules. Northern blot analysis detected SYM10 mRNA in the roots of pea,
and a higher level in nodules, but no mRNA was detected in leaves, as
shown in Figure 4c.

B. Isolation, cloning and characterisation of NFR1 genes and gene 
products. 

1. Map based cloning of Lj NFR1

The NFR1 gene was isolated using a positional cloning approach. On the
genetic map of Lotus the NFR1 locus is located on the short arm of
chromosome I, approximately 22 cM from the top, within a 7.6 cM interval, as
shown in Figure 5a. Several TM markers and PCR markers, derived from
DNA polymorphism in the genome sequences of the L. japonicus mapping
parents, were found to be closely linked to the NFR1 locus and were used to
narrow down the region. A physical map of the region, comprising a contig of
assembled BAC and TAC clones, is shown in Figure 5b. Fine mapping in an
F2 population, established from a Lotus japonicus nfr-1 mutant to wild type L.
japonicus ecotype 'Miyakojima MG-20' cross, and genotyping of 1603 mutant
plants, identified two markers (56K22, 5612^2) delimiting the NFR1 locus
within a region of 250 kb. BAC and TAC libraries, available from Satoshi
Tabata, Kazusa DNA Research Institute, Kisarazu, Chiba 292-0812 Japan,
and another BAC library from Jens Stougaard, Department of Molecular
Biology, University of Aarhus, Gustav Wieds Vej 10, DK-8000 Aarhus C, were
screened using the closest flanking markers (56L2-1, 10M24-1, 36D15) as
probes, and the NFR1 locus was localised to 36 kb within the region. The
ORFs detected within the region coded for a UFD1-like protein, a
hypothetical protein and a candidate NFR1 protein showing homology to
receptor kinases (Figure 5b).

The region in the genomes of the nfr1-1 and nfr1-2 mutants corresponding to
the candidate NFR1 gene was amplified as three fragments by PCR under
standard conditions and sequenced. The fragment of 1827 bp amplified using
PCR forward primer 5' TGG ATT TGG ATG GAG AAG G 3' (SEQ ID No: 16)
and reverse primer 5' TTT GGT GTG AGA TTA TGA GG 3' (SEQ ID No: 17)
contains single nucleotide substitutions leading to translational stop codons
in both mutant alleles: nfr1-1, with a GAA to TAA substitution, and
nfr1-2, with a GAA to TAA substitution. The physical and genetic mapping of
the nfr1 locus, combined with the identification of mutations in two
independent nfr1 mutant alleles, provides unequivocal evidence that the
sequenced NFR1 gene is required for Nod-factor perception and subsequent
signal transduction.
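
The effect of the reported GAA to TAA substitution can be checked against
the standard genetic code. The following is a minimal sketch; the table
construction and the function name are illustrative and not part of the
described method.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(wild_type_codon, mutant_codon):
    """Classify a codon change as silent, nonsense or missense."""
    wild_type = CODON_TABLE[wild_type_codon.upper()]
    mutant = CODON_TABLE[mutant_codon.upper()]
    if wild_type == mutant:
        return "silent"
    if mutant == "*":          # "*" marks a translational stop codon
        return "nonsense"
    return "missense"


# GAA encodes glutamic acid (E); TAA is a stop codon.
print(CODON_TABLE["GAA"], "->", CODON_TABLE["TAA"],
      classify_substitution("GAA", "TAA"))
```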

2. Cloning the Lj NFR1 cDNAs

Two alternatively spliced Lj NFR1 cDNAs were identified using a combination
of cDNA library screening and 5' RACE on root RNA from Lotus japonicus. A
Lotus root cDNA library (Poulsen et al., 2002, MPMI 15: 376-379) was
screened with an NFR1 gene probe generated by PCR amplification of the
nucleotides between 9689 and 10055 of the genomic sequence, using the
primer pair 5' TTGCAGATTGCACAACTAGG 3' (SEQ ID No: 18) and
5' ACTTAGAATCTGCAACTTTGC 3' (SEQ ID No: 19). Total RNA extracted
from Lotus roots was amplified by 5' RACE, according to the standard
protocol, using the gene specific reverse primer
5' ACTTAGAATCTGCAACTTTGC 3' (SEQ ID No: 20). Based on the
sequence of isolated NFR1 cDNAs and 5' RACE products, the NFR1 gene
produces two mRNA species, of 2187 (SEQ ID No: 21) and 2193 nucleotides
(SEQ ID No: 22), with a 5' leader sequence of 114 nucleotides and a 3'
untranslated region of 207 nucleotides (Figure 5c). Alignment of genomic and
cDNA sequences defined 12 exons in NFR1 and a gene structure spanning
10235 bp (SEQ ID No: 23). The sequenced region includes 4057 bp from the
stop codon of the previous gene up to the transcription start point of NFR1 +
6009 bp of NFR1 + 187 bp of 3' genomic sequence. Alternative splice donor
sites at the 3' end of exon IV account for the two alternative NFR1 mRNA
species.

3. Primary sequence and structural domains of LjNFR1 and mutant
alleles.

The primary sequence and domain structure of NFR1, encoded by LjNFR1,
are consistent with a transmembrane Nod-factor binding protein, required for
Nod-factor perception in Rhizobium-legume symbiosis. The alternatively
spliced NFR1 cDNAs encode NFR1 proteins of 621 (SEQ ID No: 24) and 623
amino acids (SEQ ID No: 25), with a predicted molecular mass of 68.09 kD
and 68.23 kD, respectively. The protein has an amino-terminal signal peptide,
followed by an extracellular domain having two LysM-type motifs, a
transmembrane domain, and an intracellular carboxy-terminal domain
comprising serine/threonine kinase motifs.
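
The stated protein lengths can be cross-checked against the mRNA, 5' leader
and 3' untranslated region lengths reported for the two NFR1 cDNAs. The
sketch below assumes that the reported mRNA lengths exclude any poly(A)
tail and that the coding region ends with a single stop codon; the function
name is illustrative.

```python
def encoded_residues(mrna_nt, leader_nt, utr3_nt):
    """Number of amino acids encoded by an mRNA of known UTR lengths.

    Assumes coding region = mRNA - 5' leader - 3' UTR, and that the last
    codon of the coding region is a stop codon (not translated).
    """
    coding_nt = mrna_nt - leader_nt - utr3_nt
    if coding_nt % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return coding_nt // 3 - 1  # subtract the stop codon


for mrna_nt in (2187, 2193):
    residues = encoded_residues(mrna_nt, leader_nt=114, utr3_nt=207)
    print(f"{mrna_nt} nt mRNA -> {residues} amino acids")
```

Under these assumptions the two transcripts give 621 and 623 residues, in
agreement with the lengths given for SEQ ID No: 24 and SEQ ID No: 25.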

In nfr1-1, a stop codon in kinase domain VIII encodes truncated polypeptides
of 490 and 492 amino acids, and in nfr1-2 a stop codon between domains IX
and XI encodes truncated polypeptides of 526 and 528 amino acids, as
indicated in Figure 6a.
In Figure 6b the M1 LysM motif of NFR1 is aligned with the LysM motifs from
Arabidopsis thaliana and the SMART consensus, and the M2 LysM of NFR1 with
the Volvox carteri chitinase (Acc No: T08150), the closest related Arabidopsis
thaliana receptor kinase (Acc No: NP_566689), the rice (Acc No:
BAB89226) and the consensus SMART LysM motif.

4. The LjNFR1 protein family is not found in non-nodulating plants

Comparative analysis defines LjNFR1 as a member of a second novel family
of transmembrane Nod-factor binding proteins. Although proteins having both
receptor-like kinase domains and LysM motifs are predicted from plant
genome sequences, their homology to NFR1 is low and their putative
function unknown. Arabidopsis has five predicted receptor-like kinases with
LysM motifs in the extracellular domain, and one of them (At3g21630) is 54%
identical to NFR1 at the protein level. Rice has 2 genes in the same class,
and one (BAB89226) encodes a protein with 32% identity to NFR1.

This suggests that the NFR1 protein is essential for Nod-factor perception
and its absence from non-nodulating plants may be a key limiting factor in the
establishment of rhizobial-root interactions in these plants. Although NFR1
shares the same domain structure as NFR5, their primary sequence
homology is low (Figure 11).

5. Expression of the LjNFR1, NFR5 and SymRK symbiotic genes is root
specific and independently regulated.

The NFR1 dependent root hair curling, in the susceptible zone located just
behind the root tip, is correlated with root specific NFR1 gene expression.
Steady-state NFR1 mRNA levels were measured in different plant organs
using quantitative real-time PCR and Northern blot analysis as described
above in section A.7. NFR1 mRNA was only expressed in root tissue, and
remained below detectable levels in leaves, flowers, pods and nodules, as
shown in Figure 7a. Upon inoculation with M. loti, the expression of NFR1 in
wild type plants is relatively stable for at least 12 days after inoculation
(Figure 7b). Real-time PCR experiments revealed no difference between the
levels of the two NFR1 transcripts detected in the root RNA, suggesting that
the alternative splicing of exon 4 is not differentially regulated.
NFR1, NFR5 and SymRK gene expression in roots, before and following
Rhizobium inoculation, was determined by real-time PCR in wild type and
nfr1, nfr5 and symrk mutant genotypes. The expression of NFR1, NFR5 and
SymRK genes in un-inoculated and inoculated roots was not significantly
influenced by the symbiotic mutant genotype (Figure 7b, c, d), indicating that
transcriptional regulation of these genes is mutually independent.

Example 2. 

Functional properties of the Nod-factor binding element and its 
component NFR proteins 

The functional and regulatory properties of the Nod-factor binding element
and its component NFR proteins provide valuable tools for monitoring the
functional expression and specific activity of the NFR proteins. Nod-factor
perception by the Nod-factor binding element triggers the rhizobial-host
interaction, which includes depolarisation of the plasma membrane, ion
fluxes, alkalization of the external root hair space of the invasion zone,
calcium oscillations and cytoplasmic alkalization in epidermal cells, root hair
morphological changes, infection thread formation and the initiation of the
nodule primordia. These physiological events are accompanied and
coordinated by the induction of specific plant symbiotic genes, called
nodulins. For example, the NIN gene encodes a putative transcriptional
regulator facilitating infection thread formation and inception of the nodule
primordia and limits the region of root cell-rhizobial interaction competence to
a narrow invasion zone (Geurts and Bisseling, 2002, supra). Since nin
mutants develop normal mycorrhiza, the NIN gene lies in the rhizobia-specific
branch of the symbiotic signalling pathway, downstream of the common
pathway. Ion fluxes, pH changes, root hair deformation and nodule formation
are all absent in NFR1 and NFR5 mutant plants, and hence the functional
activity of these genes must be required for all downstream physiological
responses. Several physiological and molecular markers that are diagnostic
of NFR expression are provided below.

1. Morphological marker of NFR1 and NFR5 gene expression 

When wild type Lotus japonicus plants are inoculated with Mesorhizobium
loti, the earliest visible evidence of infection is root hair deformation and root
hair curling, which occurs 24 hours after inoculation, as shown in Figure 8a.
However, mutant plants carrying the nfr1-1 (Figure 8c), nfr1-2, nfr5-1, nfr5-2
or nfr5-3 alleles (as in Figure 8c) all failed to produce root hair curling or
deformation, infection threads or nodule primordia in response to infection by
Mesorhizobium loti with all three strains tested (NZP2235, R7A and TONO).
Lipochitin-oligosaccharides purified from M. loti strain R7A, which induce
root hair deformation and branching in wild type plants (Figure 8b), also failed
to induce any deformation of root hairs of the nfr1-1 and nfr5-1 mutants
(Figure 8d), evidencing the key role of the NFR1 and NFR5 genes in
Nod-factor perception.

Mutations in genes expressing the downstream components of the symbiosis
signalling pathway, namely symRK and nin, have clearly distinguishable
phenotypes. After infection with Mesorhizobium loti, the root hairs of symRK
plants swell into balloon structures (Figure 8e), while the nin mutants produce
an excessive root hair response (Figure 8g). The response of double mutants
carrying nfr1-1/symRK-3 mutant alleles or nfr1-1/nin alleles to
Mesorhizobium loti infection (Figure 8f, h) is similar to that of nfr1-1 mutants,
demonstrating that the nfr1-1 mutation is epistatic to the symRK and nin
mutations, and hence determines an earlier step in the symbiotic signalling
pathway.

2. Physiological marker of NFR1 and NFR5 gene expression

When the root hairs of wild type Lotus plants are exposed to M. loti
Nod-factor, the plasma membrane is depolarised and an alkalisation occurs in
the root hair space of the invasion zone (Figure 9a). The extracellular pH was
monitored continuously in a flow-through regime using a pH-selective
microelectrode, placed within the root hair space. Membrane potential was
measured simultaneously with pH, and the calculated values are each based
on at least three equivalent experiments. Mutants carrying nfr1 and nfr5
alleles do not respond normally to Nod-factor stimulation. Two nfr5 alleles
abolish the response to Nod-factors (Figure 9b), while the nfr1-1 allele
causes a diminished and slower alkalisation, and the nfr1-2 allele causes the
acidification of the extracellular root hair space (Figure 9c). Both the NFR1
and NFR5 genes are thus essential for mounting the earliest detectable
cellular and electrophysiological responses to Nod-factor, which can be used
to monitor their functional activity.

The early physiological response of the symRK-3 and symRK-1 mutant
plants to Mesorhizobium loti Nod-factor is similar to the wild type (Figure 9d)
and clearly distinguishable from the response of both the nfr1 and nfr5
mutants.

The response of the double mutant, carrying nfr1-2/symRK-3 mutant alleles,
to Nod-factor (Figure 9e) is similar to that of nfr1-2 mutants, further
supporting that the nfr1-2 mutation is epistatic to symRK-3 and determines
an earlier step in the symbiotic signalling pathway.

3. NFR1 and NFR5 mediated Nod-factor perception lies upstream of NIN
and ENOD and is required for their expression.

The symbiotic expression of the nodulin genes, Lotus japonicus ENOD2
(Niwa, S. et al., 2001 MPMI 14: 848-56) and NIN, in roots following rhizobial
inoculation, provides a marker for NFR gene expression. The steady-state
levels of NIN and ENOD2 mRNA were measured in roots before and
following rhizobial inoculation by quantitative real-time PCR, using the primer
pairs:

5'AATGCTCTTGATCAGGCTG3' (SEQ ID No: 26) and
5'AGGAGCCCAAGTGAGTGCTA3' (SEQ ID No: 27) for amplification of NIN
mRNA reverse transcripts; and the primer pairs:
5'CAG GAA AAA CCA CCA CCT GT3' (SEQ ID No: 28) and
5'ATGGAGGCGAATACACTGGTG3' (SEQ ID No: 29) for amplification of
ENOD2 mRNA reverse transcripts. The identity of the amplified sequences
was confirmed by sequencing.

Five hours after inoculation, induction of NIN gene expression was detected
in the wild type plants, while induction of ENOD2 occurs after 12 days as
shown in Figure 10a and b. In the nfr1 and nfr5 mutants, activation of NIN
and ENOD2 was not detected, demonstrating that functional NFR1 and
NFR5 genes can be monitored by the activation of these early nodulin genes.
Lotus plants transformed with a NIN gene promoter region fused to a GUS
reporter gene provide a further tool to monitor NFR gene function.
Expression of the NIN-GUS reporter can be induced in root hairs and
epidermal cells of the root invasion zone following rhizobial inoculation in
transformed wild type plants. In contrast, expression of the NIN-GUS reporter
in an nfr1 mutant was not detected following rhizobial inoculation. Likewise,
NIN-GUS expression was induced in the invasion zone of wild type plants
after Nod-factor application, while in an nfr1 mutant background no expression
was detected. The requirement for NFR1 function was confirmed in nfr1-1, nin
double mutants by the absence of root hair curling and excessive root hair
curling (Fig 8).

The LjCBP1 gene, T-DNA tagged with a promoter-less GUS in the T90 line,
is rapidly activated after M. loti inoculation as seen for NIN-GUS, thus
providing an independent and sensitive reporter of early nodulin gene
expression (Webb et al. 2000, Molecular Plant-Microbe Interact. 13: 606-616).
Parallel experiments comparing expression of the LjCBP1 promoter
GUS fusion in wt and nfr1 mutant background confirm the requirement for a
functional NFR1 for activation of the early response to bacteria and
Nod-factor.

Example 3. 

Transgenic expression of NFR polypeptides and complementation of 
the nfr mutants 

The NFR genes, encoding the NFR1 and NFR5 protein components of the
Nod-factor binding element, can each be stably integrated, as a transgene,
into the genome of a plant, such as a non-nodulating plant or a mutant
non-nodulating plant, by transformation. Expression of this transgene, directed
by an operably linked promoter, can be detected by expression of the
respective NFR protein in the transformed plant and functional
complementation of a non-nodulating mutant plant.

A wild type NFR5 transgene expression cassette of 3.6 kb, comprising a 1175
bp promoter region, the NFR5 gene and a 441 bp 3' UTR, was cloned in a
vector (pIV10), and the vector was recombined into the T-DNA of
Agrobacterium rhizogenes strain AR12 and AR1193 by triparental mating.
The NFR5 expression cassette in pIV10 was subsequently transformed into
non-nodulating Lotus nfr5-1 and nfr5-2 mutants via Agrobacterium
rhizogenes-mediated transformation according to the standard protocol
(Stougaard 1995, Methods in Molecular Biology volume 49, Plant Gene
Transfer and Expression Protocols, p 49-63). In parallel, control transgenic
Lotus nfr5-1 and nfr5-2 mutant plants were generated, which were
transformed with an empty vector, lacking the NFR5 expression cassette.
The nodulation phenotype of the transgenic hairy root tissue of the
transformed mutant Lotus plants was scored after inoculation with
Mesorhizobium loti (M. loti) strain NZP2235. In planta complementation of the
nfr5-1 and nfr5-2 mutants by the NFR5 transgene was accomplished, as
shown in Table 2, with an efficiency of &58%, and the establishment of
normal rhizobial-legume interactions and development of nitrogen fixing
nodules. Complementation was dependent on transformation with a vector
comprising the NFR5 expression cassette.

A transgene expression cassette, comprising the wild type NFR1 gene
with 3020 bp of promoter region, the NFR1 ORF and 394 bp of
3' untranslated region, was cloned into the pIV10 vector and recombined into
Agrobacterium rhizogenes strain AR12 and AR1193 by triparental mating.
Agrobacterium rhizogenes-mediated transformation was used to transform
the gene into non-nodulating Lotus nfr1-1 and nfr1-2 mutants in parallel with
a control empty vector. In planta complementation of the Lotus nfr1-1 and
nfr1-2 mutants by the NFR1 transgene was accomplished, as shown in
Table 3, with an efficiency of >60%, and the establishment of normal
Rhizobium-legume interactions with M. loti strain NZP2235, and
development of nitrogen fixing nodules. Complementation was dependent on
transformation with a vector comprising the NFR1 expression cassette.

Example 4 



Expression and characterisation of the NFR1, NFR5 and SYM10
proteins in transgenic plants

NFR1, NFR5 and SYM10 proteins are expressed and purified from
transgenic plants, by exploiting easy and well described transformation
procedures for Lotus (Stougaard 1995, supra) and tobacco (Draper et
al. 1988, Plant Genetic Transformation and Gene Expression, A Laboratory
Manual, Blackwell Scientific Publications). Expression in plants is particularly
advantageous, since it facilitates the correct folding of these transmembrane
proteins and provides for correct post-translational modification, such as
phosphorylation. The primary sequences of the expressed proteins are
extended with commercially available epitope tags (Myc or FLAG), to allow
their purification from plant protein extracts. DNA sequences encoding the
tags are ligated into the expression cassette for each protein, in frame, either
at the 5' or the 3' end of the cDNA coding region. These modified coding
regions are then operably linked to a promoter, and recombined into
Agrobacterium rhizogenes. Lotus is transformed by wound-site infection and
from the transgenic roots independent root cultures are established in vitro
(Stougaard 1995, supra). NFR1, NFR5 and SYM10 proteins are then purified
from root cultures by affinity chromatography using the epitope specific
antibody and standard procedures. Alternatively the proteins are
immunoprecipitated from crude extracts or from semi-purified preparations.
Proteins are detected by Western blotting methods. For transformation and
expression in tobacco, the epitope tagged cDNAs are cloned into an
expression cassette comprising a constitutively expressed 35S promoter and
a 3'UTR and subsequently inserted into binary vectors. After transfer of the
binary vector into Agrobacterium tumefaciens, transgenic tobacco plants are
obtained by the transformation regeneration procedure (Draper et al. 1988,
supra). Proteins are then extracted from crude or semi-purified extracts of
tobacco leaves using affinity purification or immunoprecipitation methods.
The epitope tagged purified protein preparations are used to raise
mono-specific antibodies towards the NFR1, NFR5 and SYM10 proteins.
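
The requirement that the tag be ligated in frame can be illustrated with a
short sketch for a fusion at the 3' end of a coding region. The FLAG peptide
is DYKDDDDK; the DNA encoding used here is only one of many possible
encodings, and the toy ORF and function names are hypothetical, not
sequences from this specification.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

FLAG_PEPTIDE = "DYKDDDDK"
FLAG_DNA = "GACTACAAAGACGATGACGACAAG"  # assumed encoding of the FLAG peptide


def translate(dna):
    """Translate from the first base until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def add_c_terminal_tag(orf, tag_dna):
    """Insert tag_dna in frame immediately before the ORF's stop codon."""
    if len(orf) % 3 or len(tag_dna) % 3:
        raise ValueError("ORF and tag must each be a multiple of 3 nt")
    if CODON_TABLE[orf[-3:]] != "*":
        raise ValueError("ORF must end with a stop codon")
    return orf[:-3] + tag_dna + orf[-3:]


toy_orf = "ATGGCTGGTTAA"  # hypothetical ORF: Met-Ala-Gly-stop
tagged_orf = add_c_terminal_tag(toy_orf, FLAG_DNA)
print(translate(tagged_orf))
```

Because both the ORF and the tag are whole numbers of codons, the tag is read
in the same frame as the protein and the original stop codon terminates the
fusion.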

Example 5 

Plant breeding tools to select for enhanced nodulation frequency and
efficiency.

A successful and efficient primary interaction between a rhizobial strain and
its host depends on detection of a Rhizobium strain's unique Nod-factor
(LCO) profile by the plant host. The Nod-factor binding element and its
component NFR proteins, each with their extracellular LysM motifs, play a
key role in controlling this interaction. NFR alleles, encoding variant NFR
proteins, are shown to be correlated with the efficiency and frequency of
nodulation with a given rhizobial strain. Molecular breeding tools to detect
and distinguish different plant NFR alleles, and assays to assess the
nodulation efficiency and frequency of each allele, provide an effective
method to breed for nodulation efficiency and frequency.

Methods useful for breeding for nodulation efficiency and frequency are given
below, and the application of these techniques is illustrated for the NFR
alleles of Lotus spp. Using the Rhizobium leguminosarum bv. viciae 5560
DZL strain (Bras et al., 2000, Molecular Plant-Microbe Interact. 13: 475-479) it
is documented that the host range of this strain within the Lotus spp. depends
on the NFR1 and NFR5 alleles present in the Lotus host. When inoculated
onto wild type plants, Rhizobium leguminosarum bv. viciae 5560 DZL forms
root nodules on Lotus japonicus Gifu but the strain is unable to form root
nodules on Lotus filicaulis. Transgenic L. filicaulis transformed with the Lotus
japonicus Gifu NFR1 and NFR5 alleles do however form root nodules when
inoculated with the Rhizobium leguminosarum bv. viciae 5560 DZL strain,
proving the NFR1/NFR5 allele dependent Nod-factor recognition.

1. Determining the Nod-factor specificity and sensitivity of NFR alleles. 

Root hair curling and root hair deformation in the susceptible invasion zone is
a sensitive in vivo assay for monitoring the legume plant's ability to recognise
a Rhizobium strain or the Nod-factor synthesized by a Rhizobium strain. The
assay is performed on seedlings and established as follows. Seeds of wild
type, transgenic and mutant Lotus spp. are sterilised and germinated for 3
days. Seedlings are grown on 1/4 B&D medium (Handberg and Stougaard,
1992 supra), between two layers of sterile wet filter paper for 3 days more.
Afterwards, they are transferred into smaller petri dishes containing 1/4 B&D
medium supplemented with 12.7 nM AVG
[(S)-trans-2-amino-4-(2-aminoethoxy)-3-butenoic acid hydrochloride]
(Bras C. et al. 2000, MPMI 13: 475-479). On transfer, the seedlings are
inoculated with either 20 µl of a 1:100 dilution of a 2 days old M. loti strain
NZP2235 culture, or with M. loti strain R7A Nod-factor coated sand, or with
sterile water as a control, and a layer of wet dialysis membrane is used to
cover the whole root. A minimum of 30 seedlings are microscopically
analysed for specific deformations of the root hairs. The assay determines
the threshold sensitivity of each L. japonicus for the Nod-factor (LCO) of a
given Rhizobium strain and the frequency of root hair curling and/or
deformation.

In an alternative procedure, seeds of Lotus japonicus are surface sterilised
and germinated for 4 days on 1% agar plates containing half-strength
nitrogen-free medium (Imaizumi-Anraku et al., 1997, Plant Cell Physiol. 38:
871-881), at 26°C, under a 16h light and 8h dark regime. Straight roots, of
<1cm in length, on germlings from each cultivar are then selected and
transplanted on Fahraeus slides, in a nitrogen-free medium and grown for a
further 2 days. LCOs, prepared by n-butanol extraction and HPLC separation
from a given Rhizobium strain (Niwa et al. 2001, MPMI 14: 848-856), are
applied to the straight roots in each cultivar, at a final concentration range of
between 10"^ and 10'^ M. After 12 to 24h culture, the roots are stained with
0.1% toluidine blue and the number of root hairs showing curling is counted.
The assay determines the threshold sensitivity of each Lotus spp., carrying a
given NFR allele, for the Nod-factor (LCO) of a given Rhizobium strain and
the frequency of root hair curling.
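
One way the counts from such an assay could be reduced to a curling
frequency and a threshold sensitivity is sketched below. The cut-off
fraction, the function name, the concentrations and the counts are all
hypothetical; the text does not define how the threshold is scored.

```python
def curling_summary(counts_by_conc, min_fraction=0.10):
    """Summarise a root hair curling dose series.

    counts_by_conc maps LCO concentration (M) to a tuple
    (curled_root_hairs, scored_root_hairs).
    Returns the curling fraction at each concentration and the lowest
    concentration whose fraction reaches min_fraction (None if no
    concentration does). The 10% cut-off is an assumption.
    """
    fractions = {conc: curled / scored
                 for conc, (curled, scored) in counts_by_conc.items()}
    responding = [conc for conc, frac in fractions.items()
                  if frac >= min_fraction]
    threshold = min(responding) if responding else None
    return fractions, threshold


# Hypothetical counts for one NFR allele and one LCO preparation.
counts = {1e-7: (18, 40), 1e-9: (6, 40), 1e-11: (1, 40)}
fractions, threshold = curling_summary(counts)
for conc in sorted(fractions, reverse=True):
    print(f"{conc:.0e} M: {fractions[conc]:.3f} of root hairs curled")
print("threshold sensitivity:", threshold, "M")
```

Comparing the threshold obtained for different NFR alleles against the same
LCO preparation would then rank the alleles by sensitivity.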

2. Determining the frequency and efficiency of nodulation of NFR 
alleles. 

The efficiency of a legume plant's ability to form root nodules after inoculation
with a Rhizobium strain is determined in small scale controlled nodulation
tests. Lotus seeds are surface sterilised in 2% hypochlorite and cultivated
under aseptic conditions in nitrogen free 1/4 concentrated B&D medium.
After 3 days of germination, seedlings are inoculated with a 2 days old
culture of M. loti NZP2235 or TONO or R7A or with the R. leguminosarum bv.
viciae 5560DZL strain. In principle a set of plants is only inoculated with one
strain. For controlled competition experiments where legume-Rhizobium
recognition is determined in a mixed Rhizobium population, a set of plants
can be inoculated with more than one Rhizobium strain or with an extract
from a particular soil. Two growth regimes are used: either petri dishes with
solidified agar or Magenta jars with a solid support of burnt clay and
vermiculite. The number of root nodules developed after a chosen time
period is then counted, and the weight of the nodules developed can be
determined. The efficiency of the root nodules in terms of nitrogen fixation
can be determined in several ways, for example as the weight of the plants or
directly as the amount of N15 nitrogen incorporated in the plant molecules.

In an alternative procedure, Lotus seeds are surface sterilised and vernalised
at 4°C for 2 days on agar plates and germinated overnight at 28°C. The
seedlings are inoculated with Mesorhizobium loti strain NZP2235, TONO or
R7A LCOs (as described above) and grown in petri dishes on Jensen agar
medium at 20°C in 8h dark, 16h light regime. The number of nodules present
on the plant roots of each cultivar is determined at 3 day intervals over a
period of 25 days, providing a measure of the rate of nodulation and the
abundance of nodules per plant.
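
The rate of nodulation and the abundance of nodules per plant can be
derived from such a time course as sketched below. The scoring days and
nodule counts are hypothetical, and the use of mean nodules per plant and of
the change in that mean per day is an assumption about how the two
measures are expressed.

```python
def nodulation_time_course(days, counts_per_day):
    """Mean nodules per plant on each scoring day, and the rate between days.

    days            scoring days after inoculation, in increasing order
    counts_per_day  one list of per-plant nodule counts for each scoring day
    Returns (means, rates), where rates[i] is the change in mean nodules
    per plant per day between days[i] and days[i + 1].
    """
    means = [sum(counts) / len(counts) for counts in counts_per_day]
    rates = [(means[i + 1] - means[i]) / (days[i + 1] - days[i])
             for i in range(len(days) - 1)]
    return means, rates


# Hypothetical counts for four plants of one cultivar, scored every 3 days.
days = [7, 10, 13]
counts = [[0, 1, 2, 1], [2, 3, 2, 3], [4, 4, 3, 5]]
means, rates = nodulation_time_course(days, counts)
print("mean nodules per plant:", means)
print("nodules per plant per day:", rates)
```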

3. Determining nodule occupancy in relation to NFR allele

In agriculture the NFR Nod-factor binding element recognises Rhizobium 
bacteria under adverse soli conditions. The final measure of a particular 
strain's or commercial Rhizobium inoculum's ability to compete with the 
25 endogenous Rhizobium soli population for Invasion of a legume crop with 
particular NFR alleles, is root nodule occupancy. The proportion of nodules 
fomned after Invasion by a particular strain and the fraction of the particular 
Rhizobium strain inside Individual root nodules is detenmined by surface 
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sterilising tlie root nodule surface in hyperchlorite, followed by crushing of tlie 
nodule Into a crude extract and counting the colony fomilng Rhizobium units 
after dilution of the extract and plating on medium allowing Rhizobium growth 
(Vincent.. JM. 1970, A manual for the practical study of root nodule t>acteria. 
IBP handbook no. 15 Oxford Blackwell Scientific Publications, Lbpez-Garcia 
et al, 2001 , J Bacterioi, 1 83,7241 -7252). 
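
The plate counts can be converted into colony-forming units (CFU) per nodule
and into an occupancy fraction for the strain of interest. The following is a
minimal sketch under stated assumptions: the function names, volumes and
colony counts are hypothetical, and it assumes the strain of interest can be
counted separately from the total population (for example on a selective
medium), which the text above does not specify.

```python
def cfu_per_nodule(colonies, dilution_factor, plated_ml, extract_ml):
    """Back-calculate the CFU in the whole crude extract of one nodule.

    colonies        colonies counted on the plate
    dilution_factor fold dilution of the extract before plating (e.g. 1000)
    plated_ml       volume of diluted extract spread on the plate, in mL
    extract_ml      total volume of the crude nodule extract, in mL
    """
    cfu_per_ml = colonies * dilution_factor / plated_ml
    return cfu_per_ml * extract_ml


def occupancy_fraction(strain_cfu, total_cfu):
    """Fraction of the nodule population belonging to the strain of interest."""
    if total_cfu <= 0:
        raise ValueError("total CFU must be positive")
    return strain_cfu / total_cfu


# Hypothetical plate counts for one nodule, not data from this document.
total = cfu_per_nodule(colonies=40, dilution_factor=1000, plated_ml=0.1, extract_ml=1.0)
strain = cfu_per_nodule(colonies=30, dilution_factor=1000, plated_ml=0.1, extract_ml=1.0)
fraction = occupancy_fraction(strain, total)
print(f"{total:.0f} CFU per nodule, occupancy {fraction:.2f}")
```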

4. NFR1 and NFR5 are determinants of host range in Lotus-Rhizobium
interactions.

Wild type Lotus japonicus Gifu is nodulated by both Rhizobium
leguminosarum bv. viciae 5560 DZL (R. leg 5560 DZL) and Mesorhizobium
loti NZP2235 (M. loti NZP2235), while wild type Lotus filicaulis is only
nodulated by M. loti NZP2235. Transgenic Lotus filicaulis plants expressing
the NFR1 and NFR5 alleles of Lotus japonicus Gifu are nodulated by R. leg
5560 DZL, clearly demonstrating that the NFR alleles are the primary
determinants of host range.

Lotus filicaulis was transformed with vectors comprising NFR1 and NFR5
wild type genes and their cognate promoters from Lotus japonicus Gifu or
with empty vectors. The Lotus filicaulis transformants carrying NFR1 and
NFR5 are nodulated by R. leg 5560 DZL, albeit at reduced
efficiency/frequency (9.6%) compared to Lotus japonicus Gifu (100%), as
shown in Table 4. Mixing of NFR subunits from Lotus japonicus and Lotus
filicaulis in the Nod-factor binding element is likely to contribute to the
reduced efficiency observed. These data demonstrate that rhizobial strain
recognition specificity is determined by the NFR1 and NFR5 alleles and that
breeding for specific NFR alleles present in the germplasm or in wild relatives
can be used to select optimal legume-Rhizobium partners.

More detailed investigations show that the rhizobial strain recognition
specificity of the NFR5 and NFR1 alleles is determined by the extracellular
domain of the NFR5 and NFR1 proteins. Mutant Lotus japonicus nfr5 was
transformed with a wild type hybrid NFR5 gene "FinG5", encoding the
extracellular domain from L. filicaulis NFR5 fused to the kinase domain from
L. japonicus Gifu NFR5 (Figure 12). The hybrid gene was operably linked to
the wild type NFR5 promoter. Control transformants, comprising wild type L.
japonicus Gifu, L. filicaulis and the Lotus japonicus nfr5 mutant, transformed
with an empty vector, were generated in parallel. The transformed plants were
infected either with M. loti NZP2235 or with R. leg 5560 DZL and the formation
of nodules monitored, as shown in Table 5. The FinG5 hybrid gene
complements the nfr5 mutation, and 88% of the transformants are nodulated
by M. loti NZP2235, showing that the hybrid gene is functionally expressed.
However, the nfr5 mutants expressing the FinG5 hybrid gene are very poorly
nodulated by R. leg 5560 DZL, only 3% (corresponding to one plant), even
after prolonged infection (40 days). This demonstrates that strain specificity
of the Nod-factor binding element is determined by the extracellular domain
of its component NFR proteins.

In parallel, the Lotus japonicus nfr1 mutant was transformed with a wild type
hybrid NFR1 gene "FinG1", encoding the extracellular domain from L.
filicaulis NFR1 fused to the kinase domain from L. japonicus Gifu NFR1
(Figure 12). The hybrid gene was operably linked to the wild type NFR1
promoter. The transformed plants were infected either with M. loti NZP2235 or
with R. leg 5560 DZL and the formation of nodules was monitored, as shown
in Table 6.

The FinG1 hybrid gene complements the nfr1-1 mutation, and 100% of the
transformants were nodulated by M. loti NZP2235. However, nfr1 mutants
expressing the FinG1 hybrid gene were less efficiently nodulated (30-40%)
by R. leg 5560 DZL. Furthermore, their nodulation by R. leg 5560 DZL was
much delayed compared to their nodulation by M. loti NZP2235. Thus the
Lotus / R. leg 5560 DZL interaction is less efficient and delayed when the
transgenic host plant expresses a hybrid NFR1 comprising the extracellular
domain of Lotus filicaulis NFR1 with the kinase domain of Lotus japonicus
Gifu NFR1. These data indicate that the specific recognition of R. leg 5560
DZL by its Lotus host is at least partly specified by the extracellular domain of
NFR1 (Gifu) and that this is an allele-specific recognition. However, the NFR5
allele appears to be more important for specific recognition than NFR1.
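
The nodulation frequencies quoted for the FinG1 transformants follow directly
from the plant counts in Table 6 (plants with nodules divided by plants
infected). A short sketch of that arithmetic is given below; the variable and
function names are illustrative only, and the counts are those listed in
Table 6 for the FinG1 transgene.

```python
# (genotype, infected with, number of plants, plants with nodules), from Table 6
FING1_ROWS = [
    ("nfr1-1", "M. loti NZP2235", 8, 8),
    ("nfr1-1", "R. leg 5560 DZL", 13, 5),
    ("nfr1-2", "R. leg 5560 DZL", 10, 3),
]


def nodulation_frequency(n_plants, n_nodulated):
    """Percentage of infected plants that formed at least one nodule."""
    if n_plants <= 0:
        raise ValueError("number of plants must be positive")
    return 100.0 * n_nodulated / n_plants


frequencies = [
    (genotype, strain, nodulation_frequency(n_plants, n_nodulated))
    for genotype, strain, n_plants, n_nodulated in FING1_ROWS
]

for genotype, strain, percent in frequencies:
    print(f"{genotype} + FinG1, {strain}: {percent:.1f}% nodulated")
```

The two R. leg 5560 DZL rows give the 30-40% range stated above, against 100%
for M. loti NZP2235.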

5. NFR5 alleles and their molecular markers

The NFR5 Nod-factor binding proteins encoded by the NFR5 alleles of Lotus
japonicus ecotype Gifu (gene sequence: SEQ ID No: 7; protein sequence:
SEQ ID No: 24 & 25), and Lotus filicaulis (gene sequence SEQ ID No: 30;
protein sequence SEQ ID No: 31) have been compared, and found to show
diversity in their primary structure. Using the sequence information available
for the Lotus NFR5 gene together with the pea SYM10 gene (Table 8), the
alleles from different ecotypes or varieties of Lotus, pea and other legumes
can now be identified, and used directly in breeding programs. Molecular
markers based on DNA polymorphism are used to detect the alleles in
breeding populations. Similar use can be made of the NFR1 sequences.
Molecular DNA markers, based on the NFR5 allele sequence differences of
Lotus and pea, are highlighted in Tables 8 and 9 as examples of how DNA
polymorphism can be used directly to detect the presence of an
advantageous allele in a breeding population.

Breeding for an advantageous allele can also be carried out using molecular
markers that are genetically linked to the allele of interest, but located
outside the gene allele itself. Breeding of new Lotus japonicus lines
containing a desired NFR5 allele can, for example, be facilitated by the use
of DNA polymorphisms (simple sequence repeats (microsatellites) or single
nucleotide polymorphisms (SNPs)) which are found at loci genetically linked to
NFR5. Microsatellites and SNPs at the NFR5 locus are identified by
transferring markers from the general map, by identification of AFLP markers,
or by scanning the nucleotide sequence of the BAC and TAC clones
spanning the NFR5 locus for DNA polymorphic sequences located in close
proximity to the NFR5 gene. Table 7 lists the markers closely linked to NFR5
and the sequence differences used to design the microsatellite or SNP
markers. This principle of marker assisted breeding, using genetically linked
markers, can be applied to all plants. Microsatellite markers which generate
PCR products with a high degree of polymorphism are particularly useful for
distinguishing closely related individuals, and hence for distinguishing
different NFR5 or NFR1 alleles in a breeding program.
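
A microsatellite marker distinguishes two ecotypes because the PCR products
differ in length by the repeat unit length multiplied by the difference in
repeat number. The sketch below illustrates this for three of the markers in
Table 7; the names are illustrative, and it assumes that the sequences flanking
the repeat are identical in both ecotypes, which Table 7 does not state.

```python
# marker: (repeat unit, repeat number in MG-20, repeat number in Gifu), from Table 7
MICROSATELLITES = {
    "TM0257": ("AAG", 10, 7),
    "TM0522": ("AT", 24, 14),
    "TM0021": ("CT", 16, 13),
}


def product_size_difference(unit, repeats_a, repeats_b):
    """Expected length difference (bp) between the two PCR products."""
    return len(unit) * (repeats_a - repeats_b)


size_differences = {
    marker: product_size_difference(unit, mg20, gifu)
    for marker, (unit, mg20, gifu) in MICROSATELLITES.items()
}

for marker, diff_bp in size_differences.items():
    print(f"{marker}: MG-20 product is {diff_bp} bp longer than the Gifu product")
```

Markers with a larger size difference, such as TM0522, are easier to score on
a gel than markers whose products differ by only a few base pairs.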



Table 1

Summary of Lotus nfr5 and pea sym10 mutant alleles

Allele     Mutation                                   Lotus Spp

sym5-1     bYAeNG$U3fiO-388 deletion
sym5-2     retrotransposon integration after Q233     Lj
sym5-3     CAG->TAG, Q55->stop                        Lj
^FixG      TGG->TGA, W388->stop
           TGG->TGA, W405->stop                       Ps
           CAA->TAA, Q200->stop                       Ps
1^15       sym10 gene deleted                         Ps
TABLE 2

Complementation of Lotus japonicus nfr5 mutants with the wildtype NFR5
transgene

Lotus      Transgene      No. of   Infected with     No. of plants   Total No.
genotype                  plants                     with nodules*   of nodules

nfr5-1     NFR5           31       M. loti NZP2235   18              nd
nfr5-1     Empty vector   So       M. loti NZP2235   0               nd
nfr5-2     NFR5           5        M. loti NZP2235   1               nd
nfr5-2     Empty vector   5        M. loti NZP2235   0               nd

* Nodules only detected on transformed roots

TABLE 3

Transformation of Lotus japonicus nfr1 mutants with the wildtype NFR1
transgene

Lotus      Transgene      No. of   Infected with     No. plants     Total No.    Average No.
genotype                  plants                     with nodules   of nodules   nodules/plant

nfr1-1     NFR1           103      M. loti NZP2235   62*            310          5
nfr1-1     Empty vector   30       M. loti NZP2235   0              0            0
nfr1-2     NFR1           20       M. loti NZP2235   13*            97           7.5
nfr1-2     Empty vector   7        M. loti NZP2235   0              0            0

* Nodules only detected on transformed roots
Table 4

Lotus filicaulis transformed with wildtype NFR1 and NFR5 genes from
Lotus japonicus Gifu

Lotus              Transgene      No. of   Infected with     No. plants     Total No.    Average No.
genotype                          plants                     with nodules   of nodules   nodules/plant

Lotus filicaulis   NFR1 + NFR5    164      R. leg 5560 DZL   10*            25           2.5
Lotus filicaulis   Empty vector   65       R. leg 5560 DZL   0              0            0
Lotus japonicus    Empty vector   10       R. leg 5560 DZL   10**           >150         >15
Gifu

* Nodules only detected on transformed roots
** Nodules on normal and transformed roots



Table 5

L. japonicus nfr5 mutant transformed with a hybrid NFR5 gene "FinG5" encoding
the extracellular domain of L. filicaulis NFR5 fused to the kinase domain from
L. japonicus Gifu NFR5.

Lotus              Transgene      No. of   Infected with     No. of plants   Total No.    Average No.
genotype                          plants                     with nodules    of nodules   nodules/plant

nfr5               FinG5          31       M. loti NZP2235   28*             ~180         6.4
nfr5               Empty vector   12       M. loti NZP2235   0               0
nfr5               FinG5          34       R. leg 5560 DZL   1*              4            4 (1 plant only)
nfr5               Empty vector   Id       R. leg 5560 DZL   0               0
Lotus japonicus    Empty vector   10       R. leg 5560 DZL   10**            >150         >15
Gifu
Lotus filicaulis   Empty vector   29       R. leg 5560 DZL   0               0

* Nodules only detected on transformed roots
** Nodules on normal and transformed roots
Table 6

L. japonicus nfr1 mutant transformed with a hybrid NFR1 gene "FinG1" encoding
the extracellular domain of L. filicaulis NFR1 fused to the kinase domain from
L. japonicus Gifu NFR1.

Lotus      Transgene      No. of   Infected with     No. of plants   Total No.    Average No.
genotype                  plants                     with nodules    of nodules   nodules/plant

nfr1-1     FinG1          8        M. loti NZP2235   8*              59           7.3
nfr1-1     Empty vector   6        M. loti NZP2235   0               0            0
nfr1-1     FinG1          13       R. leg 5560 DZL   5*#             15           3
nfr1-1     Empty vector   9        R. leg 5560 DZL   0               0            0
nfr1-2     FinG1          10       R. leg 5560 DZL   3*#             12           4
nfr1-2     Empty vector   4        R. leg 5560 DZL   0               0            0

* Nodules only detected on transformed roots
# Nodules were detectable after ~25 days.
Table 7

Molecular markers for NFR5 allele breeding in Lotus

Marker         Genetic distance   Lotus     Microsatellite sequence
               from NFR5 locus    ecotype

TM0272         2.9 cM             MG-20     ISxCT
                                  Gifu      12xCT
TM0257         1.0 cM             MG-20     10xAAG
                                  Gifu      7xAAG
LjT13i23Sfi                       Gifu      TTTTGCTGCAGCAAGTCAGACTGTTAGAGGA
                                  Fil       TTTTGCTGCAACAAGTCGGACTGTTAGAGGA
TM0522         0 cM               MG-20     24xAT
                                  Gifu      14xAT
NFR5
E32M54-12F     0.5 cM             MG-20     TTGGAAGTTCTTTTTATTAGGTTAATTTTA
                                  Fil       TTGGAAGTTCTTTTTA---GGTTAATTTTA
LjT01c03 Not   0.7 cM             Fil       CATTCCAGAAGAAAATAAGATATAATTATG
                                  MG-20     CATTCCAGAAGAAAATAAGATATAATTATG
                                  Gifu      CATTCCAGAAGAAAATAAGATATAATTATG
TM0168         2.2 cM             MG-20     19xAT
                                  Gifu      ISxAT
TM0021         3.8 cM             MG-20     16xCT
                                  Gifu      13xCT
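
For the sequence-based markers in Table 7, the polymorphic positions can be
located by comparing the two ecotype sequences base by base. The sketch below
does this for the LjT13i23Sfi marker; the function name is illustrative, and
it assumes the two sequences are already aligned and of equal length, as they
are for this marker.

```python
def snp_positions(seq_a, seq_b):
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (index + 1, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b))
        if base_a != base_b
    ]


# LjT13i23Sfi sequences as listed in Table 7
GIFU = "TTTTGCTGCAGCAAGTCAGACTGTTAGAGGA"
FIL = "TTTTGCTGCAACAAGTCGGACTGTTAGAGGA"

snps = snp_positions(GIFU, FIL)
for position, gifu_base, fil_base in snps:
    print(f"position {position}: Gifu {gifu_base} / Fil {fil_base}")
```

Each reported position is a candidate SNP around which an allele-specific
assay could be designed.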
Table 8

Nucleotide sequence variation between the pea SYM10 alleles
of pea cultivars Frisson and Finale*



Frisson 
Finale 


CTTGCATTTC TTCACAATTT 
CTTGCATTTC TTCACAATTT 


CACAACAATG 
CACAACAATG 


GCTATCTTCT 
GCTATCTTCT 


TTCTTCCTTC 
TTCTTCCTTC 


Frisson 
Finale 


TAGTTCTCAT GCCCTTTTTC 
TAGTTCTCAT GCCCTTTTTC 


TTGCACTCAT 
TTGCACTCAT 


GTTTTTTGTC 
GTTTTTTGTC 


ACTAATATTT 
ACTAATATTT 


Frisson 
Finale 


CAGCTCAACC ATTACAACTC 
CAGCTCAACC ATTACAACTC 


AGTGGAACAA ACTTTTCATG 
AGTGGAACAA ACTTTTCATG 


CCCGGTGGAT 
CCCGGTGGAT 


Frisson 
Finale 


TCACCTCCTT CATGTGAAAC 
TCACCTCCTT CATGTGAAAC 


CTATGTQACA 
CTATGTGACA 


TACTTTGCTC 
TACTTTGCTC 


GGTCTCCAAA 
GGTCTCCAAA 


Frisson 
Finale 


CTTTTTGAGC CTAACTAACA 
CTTTTTGAGC CTAACTAACA 


TATCAGATAT 
TATCAGATAT 


ATTTGATATG 
ATTTGATATG 


AGTCCTTTAT 
AGTCCTTTAT 


Frisson 
Finale 


CCATTGCAAA AGCCAGTAAC ATAGAAGATG 
CCATTGCAAA AGCCAGTAAC ATAGAACSATfi 


AGGACAAGAA 
AGGACAAGAA 


GCTGGTTGAA 
GCTGGTTOAA 


Frisson 
Finale 


GGCCAAGTCT TACTCATACC 
GGCCAAGTCT TACTCATACC 


TGTAACTTGT 
TGTAACTTQT 


GGTTGCACTA GAAATCGCTA 
GGTTGCACTA GAAATCGCTA 


Frisson 
Finale 


TTTCGCGAAT TTCACGTACA 
TTTCGCGAAT TTCACGTACA 


CAATCAAGCT 
CAATCAAGCT 


AGGTGACAAC 
AGGTGACAAC 


TATTTCATAG 
TATTTCATAG 


Frisson 
Finale 


TTTCAACCAC TTCATACCAG 
TTTCAACCAC TTCATACCAG 


AATCTTACAA ATTATGTGGA AATGGAAAAT 
AATCTTACAA ATTATGTGGA AATGGAAAAT 


Frisson 
Finale 


TTCAACCCTA ATCTAAGTCC AAATCTATTG 
TTCAACCCTA ATCTAAGTCC AAATCTATTG 


CCACCAGAAA tCAAAGTTGT 
CCACCAGAAA TCAAAGTTGT 


Frisson 
Finale 


TGTCCCTTTA TTCTGCAAAT GCCCCTCGAA 
TGTCCCTTTA TTCTGCAAAT GCCCCTCrSAA 


GAATCAGTTG AGCAAAGGAA 
GAATCAGTTG AGCAAAGGAA 


Frisson 
Finale 


TAAAGCATCT GATTACTTAT 
TAAAGCATCT GATTACTTAT 


GTGTGGCAGG 
GTGTGGCAGG 


CTAATGACAA 
CTAATGACAA 


TGTTACCCGT 
TGTTACCCGT 


Frisson 
Finale 


GTAAGTTCCA AGTTTGGTGC ATCACAAGTG 
GTAAGTTCCA AGTTTGGTGC ATCACAAGTfi 


GATATGTTTA 
GATATGTTTA 


CTGAAAACAA 
CTGAAAACAA 


Frisson 


TCAAAACTTC ACTGCTTCAA 


CCaaHrttpp 


GATTTTGATC 
GATTTTGATC 


CCTGTGACAA 
CCTGTGACAA 


Finale 


TCAAAACTTC ACTGCTTCAA CCAAflcTTCC 
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Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 



AGTTACCGGT AATTGATCAA CCATCTTCAA ATGGAAGAAA AAACAGCACT 
AGTTACCGGT AATTGATCAA CCATCTTCAA ATQGAAQAAA AAAPiiariir^ 

CAAAAACCTG CTTTTATAAT TGGTATTAGC CTAGGATGTG CTTTTTTCGT 
CAAAAACCTG CTTTTATA AT TGGTATTAGC CTAGGATQTG CTTTTTTCQT 

TGTAGTTTTA ACACTATCAC TTGTTTATGT ATATTGTCTG AAAATGAAGA 
TGTAGTTTTA ACACTATgA C TTGTTTATGT ATATTQTCTG AAAATGAAGA 

GATTGAATAG GAGTACTTCA TTGGCGGAQA CTGCG6ATAA GTTACTTTCA 
GATTGAATAG GAGTAOTT CA TTGQOGGAGA CTGCGGATAA GTTACTTTgA 

GGTGTTTCGG GTTATGTAAG CAAGCCAACA ATGTATGAAA TGGATGCGAT 
GGTGTTTCGG GTTATGTAA Q CAAGCCAAgA ATGTATGAAA TGQATGCGAT 

CATGGAAGCT ACAATGAACC TGAGTGAGAA TTGTAAGATT GGTGAATcBg 
CATGGAAGCT ACAATGAA QC TGAQTGAGAA TTGTAAQA TT GCTGHATrP 

TTTACAAGGC TAATATAGAT GGTAGAGTTT TAGCAGTGAA AAAAATCAAG 
TTTACAAGGC TAATATAGAT GGTAGAGTTT TAGCAGTGAA AAAAATCAAG 



aaagatgctt ctgaggagct gaaaat 
aaagatg ctt ctgaggagct gaaaat' 




CAGAAGGTAA ATCATGGAAA 
CAGAAGQTAA ATCATGGAAA 



TCTTGTGAAA CTTATGGGTG TGTCTTCCGA CAAc^. 
TCTTGTGAAA CTTATGG QTG TGTCTTCCGA CAACG; 




AACTGTTTCC 
AACTGTTTCC 



Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 



TTGTTTACGA GTATGCTGAA AATGGATCAC TTGATGAGTG GTTQTTCTCA 
TTGTTTACGA GTATGCTGAA AATGGATCAg TTGATGAGTQ GTTGTTgTrA 

^^IIJ'^^^^ AAACTTCGAA CTCGGTGGTC TCGCTTACAT GGTCTCAGAG 
GAGTiSGTCGA AAACTTCGAA CTCGGT G GTC TCGCTTACAT GGTOTrAGaG 

AATAACAGTA GCAGTGGATG TTGCAGTTGG TTTGCAATAC ATGCATGAAC 
AATAACAGTA GCAGTQG ATG TTGCAGTTGG TTTGCAATAC ATQPATGA&r' 

ATACTTACCC AAGAATAATC CACAGAGACA TCACAACAAG TAATATCCTT 
ATACTTACCC AAGAATAATC CACAGAGACA TCACAAgAAG TAATATCCTT 

CTGGATTCAA ACTTTAAGGC CAAGATAGCG AATTTTTCAA TGGCCAGAAC 
CTGGATTCAA ACTTTA AGGC CAAGATAGCG AATTTTTCAA TGGCCAGAAC 

TTCAACAAAT TCCATGATGC CGAAAATCGA TGTTTTCGCT TTTGGGGTGG 
TTCAACAAAT TCCATG ATGC CGAAAATCGA TQTTTTCQCT TTTGGGGTQQ 

TTCTGATTGA GTTGCTTACC GGCAAGAAAG CGATAACAAC GATGGAAAAT 
TTCTGATTGA GTTGCTTACC GGCAAGAAAG CGATAACAAC r^TGGAAAAT 
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Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 

Frisson 
Finale 



GGCGAGGTGG TTATTCTGTG GAAGGATTTC TGGAAGATTT TTGATCTAGA 
GGCGAGGTGG TTATTCTQ TG GAAGGATTTC TGGAAGATTT TTGATCTAGA 

AGGGAATAGA 6AAGAGAGCT TAAGAAAATG GATGGATCCT AAGCTAGAGA 
AGGGAATAGA GAAGAG AGCT TAAGAAAATG GATGGATCCT AA6CTAGAGA 

ATTTTTATCC TATTGATAAT GCTCTTAGTT TGGCTTCTTT GGCA6TGAAT 
ATTTTTATCC TATTGATAA T GCTCTTAGTT TGGCTTCTTT GGCAGTQAAT 

TGTACTGCAG ATAAATCATT GTCAAGACCA AGCATTGCAG AAATTGTTCT 
TGTACTGCAG ATAAATCAT T GTCAAGACCA AGCATTGCAG AAATTOTTCT 

TTGTCTTTCT CTTCTCAATC AATCATCATC TGAACCAATG TTAGAAAGAT 
TTGTCTTTCT CTTCTCAATC AATCATCATC TGAACCAATG TTAGAAAGAT 



Frisson 
Finale 


CCTTGACATC 
CCTTGACATC 


TGGTTTAGAT 
TGGTTTAGAT 


Frisson 


GTAGCTCGTT 


GATATTCATT 


Finale 


GTAGCTCGTT 


GATATTCATT 


Frisson 


AGTTTCTTAT 


ATTCAAGATG 


Finale 


AGTTTCTTAT 


ATTCAAGATG 


Frisson 


TCTTTATGTG 


TGGAACTATA 


Finale 


TCTTTATGTG 


TGGAACTATA 


Frisson 


i^bxCATTTT 


TCCATGTT 


Finale 


AgTTCATTTT 


TCCATGTT 




TCAATGCTTC 
TCAATGCTTC 



* Nucleotide differences are shaded black and the coding region is underlined
Table 9

Protein sequence differences encoded by the pea SYM10 alleles
of pea cultivars Frisson and Finale*

Frisson  MAIFFLPSSS HALFLALMFP VTNISAQPLQ LSGTNFSCPV DSPPSCETYV
Finale   MAIFPLPSSS HALFLALMFP VTNISAQPLQ LSGTNFSCPV DSPPSCETYV

Frisson  TYFARSPNFL SLTNISDIFD MSPLSIAKAS NIEDEDKKLV EGQVLLIPVT
Finale   TYFARSPNFL SLTNISDIFD MSPLSIAKAS NIEDEDKKLV EGQVLLIPVT

Frisson  CGCTRNRYFA NFTYTIKLGD NYFIVSTTSY QNLTNYVEME NFNPNLSPNL
Finale   CGCTRNRYFA NFTYTIKLGD NYFIVSTTSY QNLTNYVEME NFNPNLSPNL

Frisson  LPPEIKVWP LFCKCPSKNQ LSKGIKHLIT YVWQANDNVT RVSSKFGASQ
Finale   LPPEIKWVP LFCKCPSKNQ LSKGIKHLIT YVWQANDNVT RVSSKFGASQ

Frisson  VDMFTENNQN FTASTNVPIL IPVTKLPVID QPSSNGRKNS TQKPAFIIQI
Finale   VDMFTENNQN FTASTNVPIL IPVTKLPVID QPSSNGRKNS TQKPAFIIGI

Frisson  SLGCAFFWV LTLSLVYVYC LKMKRLNRST SLAETADKLL SGVSGYVSKP
Finale   SLGCAFFVW LTLSLVYVYC LKMKRLNRST SLAETADKLL SGVSGYVSKP

Frisson  TMYBMDAIME ATMNLSENCK IGBSVYKANI DGRVLAVKKI KKDASEELKI
Finale   TMYEMDAIME ATMNLSENCK IGESVYKANI DGRVLAVKKI KKDASEELKI

Frisson  LQKVNHGNLV KLMGVSSDnJ GNCFLVYEYA ENGSLDEWLP SeHskTSNSV
Finale   LQKVNHGNLV KLMGVSSDNg GNCFLVYEYA ENGSLDEWLP SeBskTSMSV

Frisson  VSLTWSQRIT VAVDVAV6LQ YMHEHTYPRl IHRDITTSNI LLDSNPKAKI
Finale   VSLTWSQRIT VAVDVAVGLQ YMHEHTYPRl IHRDITTSNI LLDSNPKAKI

Frisson  ANPSMARTST NSMMPKIDVP AFGWLIELL TGKKAITTME NGEWILWKD
Finale   ANFSMARTST NSMMPKIDVP AFGWLIELL TGKKAITTME NGEWILWKD

Frisson  PWKIFDLEGN RBESLRKWMD PKLENPYPID NALSLASLAV NCTADKSLSR
Finale   FWKIFDLEGN REBSLRKMMD PKLENPYPID NALSLASLAV NCTADKSLSR

Frisson  PSIAEIVLCL SLLNQSSSEP MLERSLTSGL DVEATHWTS IVAR
Finale   PSIAEIVLCL SLLNQSSSEP MLERSLTSGL DVEATHWTS IVAR

* Amino acid differences are highlighted in black
Claims

1. A Nod-factor binding element comprising one or more isolated NFR
polypeptides.

2. The Nod-factor binding element of claim 1, wherein the NFR
polypeptide is NFR1, comprising an amino acid sequence
substantially identical to SEQ ID No: 25, and having specific Nod-factor
binding properties.

3. The Nod-factor binding element of claim 1, wherein the NFR
polypeptide is NFR5, comprising an amino acid sequence substantially
identical to SEQ ID No: 8, and having specific Nod-factor binding
properties.

4. The Nod-factor binding element of claim 1, comprising the NFR
polypeptides NFR1 and NFR5, comprising amino acid sequences
substantially identical to SEQ ID No: 25 and SEQ ID No: 8,
respectively, and having specific Nod-factor binding properties.

5. The Nod-factor binding element of claim 2 or 4, wherein said NFR1
polypeptide is encoded by an NFR1 gene comprising a nucleic acid
sequence substantially identical to SEQ ID No: 23.

6. The Nod-factor binding element of claim 3 or 4, wherein said NFR5
polypeptide is encoded by an NFR5 gene comprising a nucleic acid
sequence substantially identical to SEQ ID No: 7.

7. An isolated polynucleotide molecule comprising a nucleic acid
sequence encoding the NFR1 polypeptide of claim 2 that is
substantially identical to SEQ ID No: 23.

8. An isolated polynucleotide molecule comprising a nucleic acid
sequence encoding the NFR5 polypeptide of claim 3 that is
substantially identical to SEQ ID No: 7.

9. A method of producing a plant expressing the Nod-factor binding
element of claims 2 or 3, the method comprising introducing into the
plant a transgenic expression cassette comprising a nucleic acid
sequence encoding a NFR polypeptide that is substantially identical to
SEQ ID No: 25 or SEQ ID No: 8, and having specific Nod-factor
binding properties, wherein the nucleic acid sequence is operably
linked to a promoter, and selecting transgenic plants and their progeny
expressing said NFR polypeptide.

10. The method of claim 9, wherein the transgenic expression cassette is
introduced into the plant through a sexual cross.

11. The method of claim 9, wherein said promoter is a native promoter or
heterologous root specific promoter.

12. The method of claim 9, wherein said promoter is a native or
heterologous constitutive promoter.

13. A transgenic plant expressing one or more NFR polypeptides
produced according to the method of any one of claims 9 to 12.

14. The transgenic plant of claim 13, expressing NFR1 and NFR5
polypeptides comprising an amino acid sequence substantially
identical to SEQ ID No: 25 and SEQ ID No: 8, respectively, and having
specific Nod-factor binding properties.

15. The transgenic plant of claim 13, wherein the plant is a non-nodulating
dicotyledonous plant.

16. The transgenic plant of claim 13, wherein the plant is a
monocotyledonous cereal.

17. A method for marker assisted breeding of NFR alleles, encoding
variant NFR polypeptides, comprising the steps of:

a. identifying variant NFR1 or NFR5 polypeptides in a nodulating
legume species, comprising an amino acid sequence
substantially similar to SEQ ID No: 25 or SEQ ID No: 8
respectively, having specific Nod-factor binding properties, and

b. determining the nodulation frequency of legume plants
expressing said variant NFR1 or NFR5 polypeptide, and

c. identifying DNA polymorphisms at loci genetically linked to or
within the allele encoding said variant NFR1 or NFR5
polypeptide, and

d. preparing molecular markers based on said DNA
polymorphisms, and

e. using said molecular markers for the identification and selection
of plants carrying NFR alleles encoding said variant NFR1 or
NFR5 polypeptides.

18. Plants selected according to the method of claim 17, carrying NFR
alleles encoding variant NFR1 or NFR5 polypeptides comprising an
amino acid sequence substantially similar to SEQ ID No: 25 or SEQ ID
No: 8, respectively, and having specific Nod-factor binding properties.

19. Use of the method of claim 16 for breeding legumes with enhanced
nodulation frequency or root nodule occupancy or enhanced symbiotic
nitrogen fixation ability.
Abstract

The present invention provides a Nod-factor binding element, comprising one
or more NFR polypeptides encoded by NFR genes, that is useful for
providing non-nodulating plants with Nod-factor binding properties and
triggering the endosymbiotic signalling pathway leading to nodulation.
Furthermore, the invention is useful for breeding for improved nodulation in
nodulating legumes.
SEQUENCE LISTING

<110> Østjysk Innovation A/S

<120> Nod-factor perception

<130> P20030064 6 DK

<160> 31

<170> PatentIn version 3.1

<210> 1 

<211> 45 

<212> DNA 

<213> Lotus japonicus 

<400> 1 

ctaatacgac tcactatagg gcaagcagtg gtaacaacgc agagt 45 

<210> 2
<211> 29
<212> DNA
<213> Lotus japonicus
<400> 2

gctagttaaa aatgtaatag taaccacgc 29

<210> 3
<211> 21
<212> DNA
<213> Lotus japonicus
<400> 3

aaagcagcat tcatcttctg g 21

<210> 4
<211> 39
<212> DNA
<213> Artificial
<400> 4

gaccacgcgt atcgatgtcg actttttttt ttttttttv 39

<210> 5
<211> 19
<212> DNA
<213> Lotus japonicus
<400> 5

gcaagggaag gtaattcag 19


<210> 6 
<211> 2292 



<212> DNA 

<213> Lotus japonicus 
<400> 6 

ttattgatat actaaaccac aggatatttt attgacaatg tgaatgttcc atattttcaa 60 

caatgctgat tccctctgat aaagaacaag ttccttttct ctttccctgt taactatcat 120 

ttgttcccca cttcacaaac atggctgtct tctttcttac ctctggctct ctgagtcttt 180 

ttcttgcact cacgttgctt ttcactaaca tcgccgctcg atcagaaaag attagcggcc 240 

cagacttttc atgccctgtt gactcacctc cttcttgtga aacatatgtg acatacacag 300 

ctcagtctcc aaatcttctg agcctgacaa acatatctga tatatttgat atcagtcctt 360 

tgtccattgc aagagccagt aacatagatg cagggaagga caagctggtt ccaggccaag 420 

tcttactggt acctgtaact tgcggttgcg ccggaaacca ctcttctgcc aatacctcct 480 

accaaatcca gctaggtgat agctacgact ttgttgcaac cactttatat gagaacctta 540 

caaattggaa tatagtacaa gcttcaaacc caggggtaaa tccatatttg ttgccagagc 600 

gcgtcaaagt agtattccct ttattctgca ggtgcccttc aaagaaccag ttgaacaaag 660 

ggattcagta tctgattact tatgtgtgga agcccaatga caatgtttcc cttgtgagtg 720 

ccaagtttgg tgcatcccca gcggacatat tgactgaaaa ccgctacggt caagacttca 780 

ctgctgcaac caaccttcca attttgatcc cagtgacaca gttgccagag cttactcaac 840 

cttcttcaaa tggaaggaaa agcagcattc atcttctggt tatacttggt attaccctgg 900 

gatgcacgtt gctaactgca gttttaaccg ggaccctcgt atatgtatac tgccgcagaa 960 

agaaggctct gaataggact gcttcatcag ctgagactgc tgataaacta ctttctggag 1020 

tttcaggcta tgtaagcaag ccaaacgtgt atgaaatcga cgagataatg gaagctacga 1080 

aggatttcag cgatgagtgc aaggttgggg aatcagtgta caaggccaac atagaaggtc 1140 

gggttgtagc ggtaaagaaa atcaaggaag gtggtgccaa tgaggaactg aaaattctgc 1200

agaaggtaaa tcatggaaat ctggtgaaac taatgggtgt ctcctcaggc tatgatggaa 1260 

actgtttctt ggtttatgaa tatgctgaaa atgggtctct tgctgagtgg ctgttctcca 1320 

agtcttcagg aaccccaaac tcccttacat ggtctcaaag gataagcata gcagtggatg 1380 

ttgctgtggg tctgcaatac atgcatgaac atacctatcc aagaataata cacagggaca 1440 

tcacaacaag taatatcctt ctcgactcga acttcaaggc caagatagcg aatttcgcca 1500 

tggccagaac ttcgaccaac cccatgatgc caaaaatcga tgtcttcgct ttcggggtgc 1560 

ttctgataga gttgctcacc ggaaggaaag ccatgacaac caaggagaac ggcgaggtgg 1620 

ttatgctgtg gaaggatatg tgggagatct ttgacataga agagaataga gaggagagga 1680 



tcagaaaatg gatggatcct aatttagaga gcttttatca tatagataat gctctcagct 1740 

tggcatcctt agcagtgaat tgcacagctg ataagtcttt gtctcgaccc tccatggctg 1800 

aaattgttct tagcctctcc tttctcactc aacaatcatc taaccccaca ttagagagat 1860 

ccttgacttc ttctgggtta gatgtagaag atgatgctca tattgtgact tccattactg 1920 

cacgttaagc aagggaaggt aattcagttt ctcatcaaat tgatcaagat gcactttgtt 1980 

tgcgtggtta ctattacatt tttaactagc tatttgctta tttctctgta tttatttgtc 2040 

agacactgga attgaatatc atatgatgga ggagttgtct gttaatacat gtgctaataa 2100 

caaattcagg caagatagtt aattgcattt gaaatacata tttctgctca gagatggtga 2160 

acatccatgc tccgaagctc atattaagtg tggtagctat tttcttttca tctttttggg 2220 

gtgaatgcgt gttcatgtaa ctcgtaaggt gttatatatt acagaagtcg tatacgtcgt 2280 
tccaaaaaaa aa 2292 

<210> 7 
<211> 3536 
<212> DNA 

<213> Lotus japonicus Gifu
<400> 7 

ggacatgaga ttgaagctcc aaaattagct cttttttctg atgaatactt aatgctttgt 60 

tgtattcact tgattaagtg ctagaaatca tctttgcatg atcatagatt aaatgaattt 120 

ccagttggtg tgtggagagc tattttgtta tgctgacatc tgcaatttgc agggcatcta 180 

atgattgtca cttcttaaat tattattggt tgtttccgtt tctttaatta tctgttttaa 240 

tcttgcaggt catacaaatt aaaatactag ccaccaccca agacatacta aatggggtag 300 

tagagggaag ggtaaggtcg ataaggatga ctttttattc tataaaattt aggagaattt 360 

gagcttaagt ggcaaggcaa acgacattac tatacgaatt ggctttgtac cagaaacagg 420 

gaacaaataa tattttacaa ataagctatt atcatgtcag ctcatttgtt caactttgat 480 

ttgattaaaa attaaatgaa gttgaatttg ttgagctgct ttattatata tgccactgga 540 

tgtttccgca ttctaagtgc atgtttgaaa acatttctac aattgattac gaaggaaaaa 600 

ttaatcatgg agagaagctt atgtgcgtag cttctgtatt tctgaattga ttctatctgt 660 

acagtagcat ttagataatg aatgatcttg gttctcgcta agcatcaaac caatctctac 720 

ccttttaaaa ttgcaagaat tataagtcat gcattgaccc aaatccttct gtggttatgc 780 

cccttaaaaa tccggcaaga catcaagtta gttggtcatt agggttccac cagctagctg 840 

acaccttgta caacaactgg ccgtcctaaa gttgggtaag cattacaata ctaaatgcca 900 



ttttattata ttttgcgcat ggttatatac ctaagtagga tttgtccaca gtttctttga 960 

ttcggaaagg aaaaaatatt tagttgacac tgacagaagc agattttata tacatatatt 1020 

atgaaatgac tcctacatga gatacacgaa tctcatcccc atgagttgca gtttgacaga 1080 

gtacacactt atcaacttgc tggaatatag gaaagtctaa ccaatgatgt cgatccgtat 1140 

tgccttaatt ttggtaaatt tagtattaca tgatcattat tgatatacta aaccacagga 1200 

tattttattg acaatgtgaa tgttccatat tttcaacaat gctgattccc tctgataaag 1260 

aacaagttcc ttttctcttt ccctgttaac tatcatttgt tccccacttc acaaacatgg 1320 

ctgtcttctt tcttacctct ggctctctga gtctttttct tgcactcacg ttgcttttca 1380 

ctaacatcgc cgctcgatca gaaaagatta gcggcccaga cttttcatgc cctgttgact 1440

cacctccttc ttgtgaaaca tatgtgacat acacagctca gtctccaaat cttctgagcc 1500 

tgacaaacat atctgatata tttgatatca gtcctttgtc cattgcaaga gccagtaaca 1560 

tagatgcagg gaaggacaag ctggttccag gccaagtctt actggtacct gtaacttgcg 1620 

gttgcgccgg aaaccactct tctgccaata cctcctacca aatccagcta ggtgatagct 1680 

acgactttgt tgcaaccact ttatatgaga accttacaaa ttggaatata gtacaagctt 1740 

caaacccagg ggtaaatcca tatttgttgc cagagcgcgt caaagtagta ttccctttat 1800 

tctgcaggtg cccttcaaag aaccagttga acaaagggat tcagtatctg attacttatg 1860 

tgtggaagcc caatgacaat gtttcccttg tgagtgccaa gtttggtgca tccccagcgg 1920 

acatattgac tgaaaaccgc tacggtcaag acttcactgc tgcaaccaac cttccaattt 1980 

tgatcccagt gacacagttg ccagagctta ctcaaccttc ttcaaatgga aggaaaagca 2040 

gcattcatct tctggttata cttggtatta ccctgggatg cacgttgcta actgcagttt 2100 

taaccgggac cctcgtatat gtatactgcc gcagaaagaa ggctctgaat aggactgctt 2160 

catcagctga gactgctgat aaactacttt ctggagtttc aggctatgta agcaagccaa 2220 

acgtgtatga aatcgacgag ataatggaag ctacgaagga tttcagcgat gagtgcaagg 2280 

ttggggaatc agtgtacaag gccaacatag aaggtcgggt tgtagcggta aagaaaatca 2340 

aggaaggtgg tgccaatgag gaactgaaaa ttctgcagaa ggtaaatcat ggaaatctgg 2400

tgaaactaat gggtgtctcc tcaggctatg atggaaactg tttcttggtt tatgaatatg 2460 

ctgaaaatgg gtctcttgct gagtggctgt tctccaagtc ttcaggaacc ccaaactccc 2520

ttacatggtc tcaaaggata agcatagcag tggatgttgc tgtgggtctg caatacatgc 2580 

atgaacatac ctatccaaga ataatacaca gggacatcac aacaagtaat atccttctcg 2640 



actcgaactt caaggccaag atagcgaatt tcgccatggc cagaacttcg accaacccca 2700 

tgatgccaaa aatcgatgtc ttcgctttcg gggtgcttct gatagagttg ctcaccggaa 2760 

ggaaagccat gacaaccaag gagaacggcg aggtggttat gctgtggaag gatatgtggg 2820 

agatctttga catagaagag aatagagagg agaggatcag aaaatggatg gatcctaatt 2880 

tagagagctt ttatcatata gataatgctc tcagcttggc atccttagca gtgaattgca 2940 

cagctgataa gtctttgtct cgaccctcca tggctgaaat tgttcttagc ctctcctttc 3000 

tcactcaaca atcatctaac cccacattag agagatcctt gacttcttct gggttagatg 3060 

tagaagatga tgctcatatt gtgacttcca ttactgcacg ttaagcaagg gaaggtaatt 3120 

cagtttctca tcaaattgat caagatgcac tttgtttgcg tggttactat tacattttta 3180 

actagctatt tgcttatttc tctgtattta tttgtcagac actggaattg aatatcatat 3240 

gatggaggag ttgtctgtta atacatgtgc taataacaaa ttcaggcaag atagttaatt 3300 

gcatttgaaa tacatatttc tgctcagaga tggtgaacat ccatgctccg aagctcatat 3360 

taagtgtggt agctattttc ttttcatctt tttggggtga atgcgtgttc atgtaactcg 3420 

taaggtgtta tatattacag aagtcgtata cgtcgttcca ataattgatc aaggtacctg 3480 

tctatttcgt aaaaaaagcc aagtaccaac attagttgac tcgttgagag tggtgc 3536 

<210> 8 
<211> 595 
<212> PRT 

<213> Lotus japonicus 
<400> 8 

Met Ala Val Phe Phe Leu Thr Ser Gly Ser Leu Ser Leu Phe Leu Ala
1 5 10 15

Leu Thr Leu Leu Phe Thr Asn Ile Ala Ala Arg Ser Glu Lys Ile Ser
20 25 30

Gly Pro Asp Phe Ser Cys Pro Val Asp Ser Pro Pro Ser Cys Glu Thr
35 40 45

Tyr Val Thr Tyr Thr Ala Gln Ser Pro Asn Leu Leu Ser Leu Thr Asn
50 55 60

Ile Ser Asp Ile Phe Asp Ile Ser Pro Leu Ser Ile Ala Arg Ala Ser
65 70 75 80

Asn Ile Asp Ala Gly Lys Asp Lys Leu Val Pro Gly Gln Val Leu Leu
85 90 95

Val Pro Val Thr Cys Gly Cys Ala Gly Asn His Ser Ser Ala Asn Thr
100 105 110

Ser Tyr Gln Ile Gln Leu Gly Asp Ser Tyr Asp Phe Val Ala Thr Thr
115 120 125

Leu Tyr Glu Asn Leu Thr Asn Trp Asn Ile Val Gln Ala Ser Asn Pro
130 135 140

Gly Val Asn Pro Tyr Leu Leu Pro Glu Arg Val Lys Val Val Phe Pro
145 150 155 160

Leu Phe Cys Arg Cys Pro Ser Lys Asn Gln Leu Asn Lys Gly Ile Gln
165 170 175

Tyr Leu Ile Thr Tyr Val Trp Lys Pro Asn Asp Asn Val Ser Leu Val
180 185 190

Ser Ala Lys Phe Gly Ala Ser Pro Ala Asp Ile Leu Thr Glu Asn Arg
195 200 205

Tyr Gly Gln Asp Phe Thr Ala Ala Thr Asn Leu Pro Ile Leu Ile Pro
210 215 220

Val Thr Gln Leu Pro Glu Leu Thr Gln Pro Ser Ser Asn Gly Arg Lys
225 230 235 240

Ser Ser Ile His Leu Leu Val Ile Leu Gly Ile Thr Leu Gly Cys Thr
245 250 255

Leu Leu Thr Ala Val Leu Thr Gly Thr Leu Val Tyr Val Tyr Cys Arg
260 265 270

Arg Lys Lys Ala Leu Asn Arg Thr Ala Ser Ser Ala Glu Thr Ala Asp
275 280 285

Lys Leu Leu Ser Gly Val Ser Gly Tyr Val Ser Lys Pro Asn Val Tyr
290 295 300

Glu Ile Asp Glu Ile Met Glu Ala Thr Lys Asp Phe Ser Asp Glu Cys
305 310 315 320

Lys Val Gly Glu Ser Val Tyr Lys Ala Asn Ile Glu Gly Arg Val Val
325 330 335

Ala Val Lys Lys Ile Lys Glu Gly Gly Ala Asn Glu Glu Leu Lys Ile
340 345 350

Leu Gln Lys Val Asn His Gly Asn Leu Val Lys Leu Met Gly Val Ser
355 360 365

Ser Gly Tyr Asp Gly Asn Cys Phe Leu Val Tyr Glu Tyr Ala Glu Asn
370 375 380

Gly Ser Leu Ala Glu Trp Leu Phe Ser Lys Ser Ser Gly Thr Pro Asn
385 390 395 400

Ser Leu Thr Trp Ser Gln Arg Ile Ser Ile Ala Val Asp Val Ala Val
405 410 415

Gly Leu Gln Tyr Met His Glu His Thr Tyr Pro Arg Ile Ile His Arg
420 425 430

Asp Ile Thr Thr Ser Asn Ile Leu Leu Asp Ser Asn Phe Lys Ala Lys
435 440 445

Ile Ala Asn Phe Ala Met Ala Arg Thr Ser Thr Asn Pro Met Met Pro
450 455 460

Lys Ile Asp Val Phe Ala Phe Gly Val Leu Leu Ile Glu Leu Leu Thr
465 470 475 480

Gly Arg Lys Ala Met Thr Thr Lys Glu Asn Gly Glu Val Val Met Leu
485 490 495

Trp Lys Asp Met Trp Glu Ile Phe Asp Ile Glu Glu Asn Arg Glu Glu
500 505 510

Arg Ile Arg Lys Trp Met Asp Pro Asn Leu Glu Ser Phe Tyr His Ile
515 520 525

Asp Asn Ala Leu Ser Leu Ala Ser Leu Ala Val Asn Cys Thr Ala Asp
530 535 540

Lys Ser Leu Ser Arg Pro Ser Met Ala Glu Ile Val Leu Ser Leu Ser
545 550 555 560

Phe Leu Thr Gln Gln Ser Ser Asn Pro Thr Leu Glu Arg Ser Leu Thr
565 570 575

Ser Ser Gly Leu Asp Val Glu Asp Asp Ala His Ile Val Thr Ser Ile
580 585 590

Thr Ala Arg
595



<210> 9
<211> 23
<212> DNA

<213> Pisum sativum
<400> 9

atgtctgcct tctttcttcc

<210> 10
<211> 23
<212> DNA

<213> Pisum sativum
<400> 10

ccacacataa gtaatmagat act 23




<210> 11 
<211> 3800 
<212> DNA 

<213> Pisum sativum 








<400> 11 
gtgggctata tgattggtgc gtacttcacc ttgcatgaaa tatcagcaca aagtatatca 60

agtgaaaaac aatacctaaa ttccttaacc tatgatattc ttttgggaga ggttgcaaaa 120

aagttgttag ttgcagttat tatttgagtt ttgaaaatgt attgttggcc aaacattagt 180

tgatactcag gaactagctc ttgttctgat ggatacttaa tgcttcgtta tatatttgta 240

ttcacttggt caagtgctag aaatcatctt ggcacaatca caggatgaat aaacctctgg 300

ttgaaagcta cattcagtcg tttgctgatt tctgcaactt gaggggaatc taatgatttt 360

tatttattat tattgctgtt gcttactgca attatcaatt ccttttaatt tttttacaaa 420

acaagttggt tacaagatct ctttaatata ttgttatcag ttatcagttt cttttatgta 480

agaagggttt ctctatacgg aactataaag actaatcctt caaatcgggt gggacaacaa 540

aagcggcaaa gttgttcatg aagaatttta gcactgttgt attcttatca agtacagaaa 600

gccacactca agcaaaaaag tgtagggtaa gaacgacatc ttattctatt ttatttagta 660



ggagaagtca agcttatgtg gcgatgtaaa tgtcatttct atccaaacta tctttgtact 720 

agaaataggg aacatataaa ttatggagag tttgttaagg tgttttaata tattaaaacc 780 

attgtaacgg gaagtgtcaa cattgttagc tgttcattgc ctgtatatta taatagcata 840

tatataatag acttggcctt tgttaaactt taaaccatat cttttgtgag tctacccctt 900

aaaaatatgg taaaggcatc aagttagata gtctttaggt accagccagc tagctgacat 960 

tgtgtaagga catattggat tacaaaacta tattattatt accatcttta ttatattctg 1020 

cgcatgattt catacttaat ttggatttgt ccagtgtcta agatttgaaa aggaaaaata 1080 

gtagaactaa tgacagagac agaagcatat atttttaata tcaaaccaaa agatatgtcc 1140 

aaataagaga taaatataaa gtttgaggta taacaataag tcttggttgt tacttgccat 1200 

aagaaactct cttttctctt ccccataact tgcatttctt cacaatttca caacaatggc 1260 

tatcttcttt cttccttcta gttctcatgc cctttttctt gcactcatgt tttttgtcac 1320 

taatatttca gctcaaccat tacaactcag tggaacaaac ttttcatgcc cggtggattc 1380 

acctccttca tgtgaaacct atgtgacata ctttgctcgg tctccaaact ttttgagcct 1440 

aactaacata tcagatatat ttgatatgag tcctttatcc attgcaaaag ccagtaacat 1500 

agaagatgag gacaagaagc tggttgaagg ccaagtctta ctcatacctg taacttgtgg 1560 

ttgcactaga aatcgctatt tcgcgaattt cacgtacaca atcaagctag gtgacaacta 1620 

tttcatagtt tcaaccactt cataccagaa tcttacaaat tatgtggaaa tggaaaattt 1680 

caaccctaat ctaagtccaa atctattgcc accagaaatc aaagttgttg tccctttatt 1740 

ctgcaaatgc ccctcgaaga atcagttgag caaaggaata aagcatctga ttacttatgt 1800 

gtggcaggct aatgacaatg ttacccgtgt aagttccaag tttggtgcat cacaagtgga 1860 

tatgtttact gaaaacaatc aaaacttcac tgcttcaacc aacgttccga ttttgatccc 1920 

tgtgacaaag ttaccggtaa ttgatcaacc atcttcaaat ggaagaaaaa acagcactca 1980 

aaaacctgct tttataattg gtattagcct aggatgtgct tttttcgttg tagttttaac 2040 

actatcactt gtttatgtat attgtctgaa aatgaagaga ttgaatagga gtacttcatt 2100 

ggcggagact gcggataagt tactttcagg tgtttcgggt tatgtaagca agccaacaat 2160 

gtatgaaatg gatgcgatca tggaagctac aatgaacctg agtgagaatt gtaagattgg 2220 

tgaatccgtt tacaaggcta atatagatgg tagagtttta gcagtgaaaa aaatcaagaa 2280 

agatgcttct gaggagctga aaattttgca gaaggtaaat catggaaatc ttgtgaaact 2340 
tatgggtgtg tcttccgaca acgacggaaa ctgtttcctt gtttacgagt atgctgaaaa 2400 
tggatcactt gatgagtggt tgttctcaga gtcgtcgaaa acttcgaact cggtggtctc 2460 



gcttacatgg tctcagagaa taacagtagc agtggatgtt gcagttggtt tgcaatacat 2520

gcatgaacat acttacccaa gaataatcca cagagacatc acaacaagta atatccttct 2580

ggattcaaac tttaaggcca agatagcgaa tttttcaatg gccagaactt caacaaattc 2640

catgatgccg aaaatcgatg ttttcgcttt tggggtggtt ctgattgagt tgcttaccgg 2700

caagaaagcg ataacaacga tggaaaatgg cgaggtggtt attctgtgga aggatttctg 2760

gaagatttct gatctagaag ggaatagaga agagagctta agaaaatgga tggatcctaa 2820

gctagagaat ttttatccta ttgataatgc tcttagtttg gcttctttgg cagtgaattg 2880

tactgcagat aaatcattgt caagaccaag cattgcagaa attgttcttt gtctttctct 2940

tctcaatcaa tcatcatctg aaccaatgtt agaaagatcc ttgacatctg gtttagatgt 3000

tgaagctact catgttgtta cttctatagt agctcgttga tattcattca agtgaaggta 3060

acactgaatc aatgcttcag tttcttatat tcaagatggt tactttgttt agatgattat 3120

tgattacatc tttatgtgtg gaactatatg gttattttaa ttaagggaat tgttctaaaa 3180

ttcatttttc catgttattc ttttacagca tgagtttcgg taaagtgaat tgtaacctgc 3240

tattgaactc agaataattt cggttattat gttagtcatc gacactttta agaaaagtat 3300

gtttgatgtt cgatatatgt ctgacaccaa cacaacactt acaactgtga ttatgtttaa 3360

tttgtttatt tttgtgataa atcagtgttt catcatttga ttattaaggt acaattattc 3420

caaccatcct ttattaaggg cattctcttt attttttgat acaatataag acctaagtgt 3480

aaatattaaa agacatgaat tttgcaagaa aggatttgga agcctttggc 3540

acccataaaa tgttgatgca agtcagctat aacttctctc tttttctctt tttttttggg 3600

atgggatggg tattcatgta tagctaaagg cacattttaa attaaaatct tgtatatata 3660

tgcaaaagtc ttctttggtg tttcaataat tgatgaaggg accgcttacc atcgatggtt 3720

gagttaacaa taccacgtct atatatgtgg agaatctttc tcaagcatca agacttcgtt 3780

ggccagctgc taaaagacaa 3800





<210> 12 
<211> 2226 
<212> DNA 

<213> Pisum sativum
<400> 12 

ttttctcttc ctcataactt gcatttcttc acaatttcac aacaatggct atcttctttc 60 
ttccttctag ttctcatgcc ctttttcttg cactcatgtt ttttgtcact aatatttcag 120 
ctcaaccatt acaactcagt ggaacaaact tttcatgccc ggtggattca cctccttcat 180 



gtgaaaccta tgtgacatac tttgctcggt ctccaaactt tttgagccta actaacatat 240 

cagatatatt tgatatgagt cctttatcca ttgcaaaagc cagtaacata gaagatgagg 300 

acaagaagct ggttgaaggc caagtcttac tcatacctgt aacttgtggt tgcactagaa 360 

atcgctattt cgcgaatttc acgtacacaa tcaagctagg tgacaactat ttcatagttt 420 

caaccacttc ataccagaat cttacaaatt atgtggaaat ggaaaatttc aaccctaatc 480 

taagtccaaa tctattgcca ccagaaatca aagttgttgt ccctttattc tgcaaatgcc 540 

cctcgaagaa tcagttgagc aaaggaataa agcatctgat tacttatgtg tggcaggcta 600 

atgacaatgt tacccgtgta agttccaagt ttggtgcatc acaagtggat atgtttactg 660 

aaaacaatca aaacttcact gcttcaacca atgttccgat tttgatccct gtgacaaagt 720 

taccggtaat tgatcaacca tcttcaaatg gaagaaaaaa cagcactcaa aaacctgctt 780 

ttataattgg tattagccta ggatgtgctt ttttcgttgt agttttaaca ctatcacttg 840 

tttatgtata ttgtctgaaa atgaagagat tgaataggag tacttcattg gcggagactg 900 

cggataagtt actttcaggt gtttcgggtt atgtaagcaa gccaacaatg tatgaaatgg 960

atgcgatcat ggaagctaca atgaacctga gtgagaattg taagattggt gaatctgttt 1020 

acaaggctaa tatagatggt agagttttag cagtgaaaaa aatcaagaaa gatgcttctg 1080 

aggagctgaa aattctgcag aaggtaaatc atggaaatct tgtgaaactt atgggtgtgt 1140 

cttccgacaa cgaaggaaac tgtttccttg tttacgagta tgctgaaaat ggatcacttg 1200 

atgagtggtt gttctcagag tcgtcgaaaa cttcgaactc ggtggtctcg cttacatggt 1260 

ctcagagaat aacagtagca gtggatgttg cagttggttt gcaatacatg catgaacata 1320 

cttacccaag aataatccac agagacatca caacaagtaa tatccttctg gattcaaact 1380 

ttaaggccaa gatagcgaat ttttcaatgg ccagaacttc aacaaattcc atgatgccga 1440 

aaatcgatgt tttcgctttt ggggtggttc tgattgagtt gcttaccggc aagaaagcga 1500 

taacaacgat ggaaaatggc gaggtggtta ttctgtggaa ggatttctgg aagatttttg 1560 

atctagaagg gaatagagaa gagagcttaa gaaaatggat ggatcctaag ctagagaatt 1620 

tttatcctat tgataatgct cttagtttgg cttctttggc agtgaattgt actgcagata 1680 

aatcattgtc aagaccaagc attgcagaaa ttgttctttg tctttctctt ctcaatcaat 1740 
catcatctga accaatgtta gaaagatcct tgacatctgg tttagatgtt gaagctactc 1800 
atgttgttac ttctatagta gctcgttgat attcattcaa gtgaaggtaa cactaaatca 1860 
atgcttcagt ttcttatatt caagatggtt actttgttta ggtgattatt gattacatct 1920 



ctatgtgtgg aactatatgg ttattttaat taagggaatt agtctaaatt tcatttttcc 1980

atgttattct ttaaagcacg agtttcggta aagtgaattg taacctgtta ttgagctcat 2040 

aataatttca gttattatgt tagtcatcga cacttctaaa aaagtatgtc tgatgttcga 2100 

tatgtgtctg acaccaacac aaccctgacc actgtgatta cgtttaattt gtttattttt 2160 

gtgataaatc agtgtttcat catttgatta ttaaggtaca attattccaa ccatcctttt 2220
aaaaaa 2226 

<210> 13 
<211> 1968 
<212> DNA 

<213> Pisum sativum 
<400> 13 

cttgcatttc ttcacaattt cacaacaatg gctatcttct ttcttccttc tagttctcat 60 

gccctttttc ttgcactcat gttttttgtc actaatattt cagctcaacc attacaactc 120 

agtggaacaa acttttcatg cccggtggat tcacctcctt catgtgaaac ctatgtgaca 180 

tactttgctc ggtctccaaa ctttttgagc ctaactaaca tatcagatat atttgatatg 240 

agtcctttat ccattgcaaa agccagtaac atagaagatg aggacaagaa gctggttgaa 300 

ggccaagtct tactcatacc tgtaacttgt ggttgcacta gaaatcgcta tttcgcgaat 360 

ttcacgtaca caatcaagct aggtgacaac tacttcatag tttcaaccac ttcataccag 420 

aatcttacaa attatgtgga aatggaaaat ttcaacccta atctaagtcc aaatctattg 480 

ccaccagaaa tcaaagttgt tgtcccttta ttctgcaaat gcccctcgaa gaatcagttg 540 

agcaaaggaa taaagcatct gattacttat gtgtggcagg ctaatgacaa tgttacccgt 600 

gtaagttcca agtttggtgc atcacaagtg gatatgttta ctgaaaacaa tcaaaacttc 660 

actgcttcaa ccaacgttcc gattttgatc cctgtgacaa agttaccggt aattgatcaa 720 

ccatcttcaa atggaagaaa aaacagcact caaaaacctg cttttataat tggtattagc 780 

ctaggatgtg cttttttcgt tgtagtttta acactatcac ttgtttatgt atattgtctg 840 

aaaatgaaga gattgaatag gagtacttca ttggcggaga ctgcggataa gttactttca 900 

ggtgtttcgg gttatgtaag caagccaaca atgtatgaaa tggatgcgat catggaagct 960 

acaatgaacc tgagtgagaa ttgtaagatt ggtgaatccg tttacaaggc taatatagat 1020 

ggtagagttt tagcagtgaa aaaaatcaag aaagatgctt ctgaggagct gaaaattttg 1080 

cagaaggtaa atcatggaaa tcttgtgaaa cttatgggtg tgtcttccga caacgacgga 1140

aactgtttcc ttgtttacga gtatgctgaa aatggatcac ttgatgagtg gttgttctca 1200 



gagtcgtcga aaacttcgaa ctcggtggtc tcgcttacat ggtctcagag aataacagta 1260

gcagtggatg ttgcagttgg tttgcaatac atgcatgaac atacttaccc aagaataatc 1320

cacagagaca tcacaacaag taatatcctt ctggattcaa actttaaggc caagatagcg 1380

aatctttcaa tggccagaac ttcaacaaat tccatgatgc cgaaaatcga tgttttcgct 1440

tttggggtgg ttctgattga gttgcttacc ggcaagaaag cgataacaac gatggaaaat 1500

ggcgaggtgg ttattctgtg gaaggatttc tggaagattt ttgatctaga agggaataga 1560

gaagagagct taagaaaatg gatggatcct aagctagaga atttttatcc tattgataat 1620

gctcttagtt tggcttcttt ggcagtgaat tgtactgcag ataaatcatt gtcaagacca 1680

aaacngctcc ttgtctttct cttctcaatc aatcatcatc tgaaccaatg 1740

ttagaaagat ccttgacatc tggtttagat gttgaagcta ctcatgttgt tacttctata 1800

gtagctcgtt gatattcatt caagtgaagg taacactgaa tcaatgcttc agtttcttat 1860

atccaagatg gttactttgt ttagatgatt attgattaca tctttatgtg tggaactata 1920

tggttatttt aattaaggga attgttctaa aattcatttt tccatgtt 1968


<210> 14 
<211> 1938 
<212> DNA 

<213> Pisum sativum 












<400> 14 
tcttcacaat ttcacaacaa tggctatctt ctttcttcct tctagttctc atgccctttt 60

tcttgcactc atgttttttg tcactaatat ttcagctcaa ccattacaac tcagtggaac 120

aaacttttca tgcccggtgg attcacctcc ttcatgtgaa acctatgtga catactttgc 180

tcggtctcca aactttttga gcctaactaa catatcagat atatttgata tgagtccttt 240

atccattgca aaagccagta acatagaaga tgaggacaag aagctggttg aaggccaagt 300

cttactcata cctgtaactt gtggttgcac tagaaatcgc tatttcgcga atttcacgta 360

cacaatcaag ctaggtgaca actatttcat agtttcaacc acttcatacc agaatcttac 420

aaattatgtg gaaatggaaa atttcaaccc taatctaagt ccaaatctat tgccaccaga 480

aatcaaagtt gttgtccctt tattctgcaa atgcccctcg aagaatcagt tgagcaaagg 540

aataaagcat ctgattactt atgtgtggca ggctaatgac aatgttaccc gtgtaagttc 600

caagtttggt gcatcacaag tggatatgtt tactgaaaac aatcaaaact tcactgcttc 660

aaccaatgtt ccgattttga tccctgtgac aaagttaccg gtaattgatc aaccatcttc 720

aaatggaaga aaaaacagca ctcaaaaacc tgcttttata attggtatta gcctaggatg 780

tgcttttttc gttgtagttt taacactatc acttgtttat gtatattgtc tgaaaatgaa 840

gagattgaat aggagtactt cattggcgga gactgcggat aagttacttt caggtgtttc 900

gggttatgta agcaagccaa caatgtatga aatggatgcg atcatggaag ctacaatgaa 960

cctgagtgag aattgtaaga ttggtgaatc tgtttacaag gctaatatag atggtagagt 1020

tttagcagtg aaaaaaatca agaaagatgc ttctgaggag ctgaaaattc tgcagaaggt 1080

aaatcatgga aatcttgtga aacttatggg tgtgtcttcc gacaacgaag gaaactgttt 1140

ccttgtttac gagtatgctg aaaatggatc acttgatgag tggttgttct cagagttgtc 1200

gaaaacttcg aactcggtgg tctcgcttac atggtctcag agaataacag tagcagtgga 1260

tgttgcagtt ggtttgcaat acatgcatga acatacttac ccaagaataa tccacagaga 1320

catcacaaca agtaatatcc ttctggattc aaactttaag gccaagatag cgaatttttc 1380

aatggccaga acttcaacaa attccatgat gccgaaaatc gatgttttcg cttttggggt 1440

ggttctgatt gagttgctta ccggcaagaa agcgataaca acgatggaaa atggcgaggt 1500

ggttattctg tggaaggatt tctggaagat ttttgatcta gaagggaata gagaagagag 1560

cttaagaaaa tggatggatc ctaagctaga gaatttttat cctattgata atgctcttag 1620

tttggcttct ttggcagtga attgtactgc agataaatca ttgtcaagac caagcattgc 1680

agaaattgtt ctttgtcttt ctcttctcaa tcaatcatca tctgaaccaa tgttagaaag 1740

atccttgaca tctggtttag atgttgaagc tactcatgtt gttacttcta tagtagctcg 1800

ttgatattca ttcaagtgaa ggtaacacta aatcaatgct tcagtttctt atattcaaga 1860

tggttacttt gtttaggtga ttattgatta catctttatg tgtggaacta tatggttatt 1920

ttaattaagg gaattagt 1938





<210> 15 
<211> 594 
<212> PRT 

<213> Pisum sativum 
<400> 15 

Met Ala Ile Phe Phe Leu Pro Ser Ser Ser His Ala Leu Phe Leu Ala
1 5 10 15

Leu Met Phe Phe Val Thr Asn Ile Ser Ala Gln Pro Leu Gln Leu Ser
20 25 30

Gly Thr Asn Phe Ser Cys Pro Val Asp Ser Pro Pro Ser Cys Glu Thr
35 40 45

Tyr Val Thr Tyr Phe Ala Arg Ser Pro Asn Phe Leu Ser Leu Thr Asn
50 55 60

Ile Ser Asp Ile Phe Asp Met Ser Pro Leu Ser Ile Ala Lys Ala Ser
65 70 75 80

Asn Ile Glu Asp Glu Asp Lys Lys Leu Val Glu Gly Gln Val Leu Leu
85 90 95

Ile Pro Val Thr Cys Gly Cys Thr Arg Asn Arg Tyr Phe Ala Asn Phe
100 105 110

Thr Tyr Thr Ile Lys Leu Gly Asp Asn Tyr Phe Ile Val Ser Thr Thr
115 120 125

Ser Tyr Gln Asn Leu Thr Asn Tyr Val Glu Met Glu Asn Phe Asn Pro
130 135 140

Asn Leu Ser Pro Asn Leu Leu Pro Pro Glu Ile Lys Val Val Val Pro
145 150 155 160

Leu Phe Cys Lys Cys Pro Ser Lys Asn Gln Leu Ser Lys Gly Ile Lys
165 170 175

His Leu Ile Thr Tyr Val Trp Gln Ala Asn Asp Asn Val Thr Arg Val
180 185 190

Ser Ser Lys Phe Gly Ala Ser Gln Val Asp Met Phe Thr Glu Asn Asn
195 200 205

Gln Asn Phe Thr Ala Ser Thr Asn Val Pro Ile Leu Ile Pro Val Thr
210 215 220

Lys Leu Pro Val Ile Asp Gln Pro Ser Ser Asn Gly Arg Lys Asn Ser
225 230 235 240

Thr Gln Lys Pro Ala Phe Ile Ile Gly Ile Ser Leu Gly Cys Ala Phe
245 250 255

Phe Val Val Val Leu Thr Leu Ser Leu Val Tyr Val Tyr Cys Leu Lys
260 265 270

Met Lys Arg Leu Asn Arg Ser Thr Ser Leu Ala Glu Thr Ala Asp Lys
275 280 285

Leu Leu Ser Gly Val Ser Gly Tyr Val Ser Lys Pro Thr Met Tyr Glu
290 295 300

Met Asp Ala Ile Met Glu Ala Thr Met Asn Leu Ser Glu Asn Cys Lys
305 310 315 320

Ile Gly Glu Ser Val Tyr Lys Ala Asn Ile Asp Gly Arg Val Leu Ala
325 330 335

Val Lys Lys Ile Lys Lys Asp Ala Ser Glu Glu Leu Lys Ile Leu Gln
340 345 350

Lys Val Asn His Gly Asn Leu Val Lys Leu Met Gly Val Ser Ser Asp
355 360 365

Asn Asp Gly Asn Cys Phe Leu Val Tyr Glu Tyr Ala Glu Asn Gly Ser
370 375 380

Leu Asp Glu Trp Leu Phe Ser Glu Ser Ser Lys Thr Ser Asn Ser Val
385 390 395 400

Val Ser Leu Thr Trp Ser Gln Arg Ile Thr Val Ala Val Asp Val Ala
405 410 415

Val Gly Leu Gln Tyr Met His Glu His Thr Tyr Pro Arg Ile Ile His
420 425 430

Arg Asp Ile Thr Thr Ser Asn Ile Leu Leu Asp Ser Asn Phe Lys Ala
435 440 445

Lys Ile Ala Asn Phe Ser Met Ala Arg Thr Ser Thr Asn Ser Met Met
450 455 460

Pro Lys Ile Asp Val Phe Ala Phe Gly Val Val Leu Ile Glu Leu Leu
465 470 475 480

Thr Gly Lys Lys Ala Ile Thr Thr Met Glu Asn Gly Glu Val Val Ile
485 490 495

Leu Trp Lys Asp Phe Trp Lys Ile Phe Asp Leu Glu Gly Asn Arg Glu
500 505 510

Glu Ser Leu Arg Lys Trp Met Asp Pro Lys Leu Glu Asn Phe Tyr Pro
515 520 525

Ile Asp Asn Ala Leu Ser Leu Ala Ser Leu Ala Val Asn Cys Thr Ala
530 535 540

Asp Lys Ser Leu Ser Arg Pro Ser Ile Ala Glu Ile Val Leu Cys Leu
545 550 555 560

Ser Leu Leu Asn Gln Ser Ser Ser Glu Pro Met Leu Glu Arg Ser Leu
565 570 575

Thr Ser Gly Leu Asp Val Glu Ala Thr His Val Val Thr Ser Ile Val
580 585 590

Ala Arg



<210> 16 
<211> 19 
<212> DNA

<213> Lotus japonicus 
<400> 16 

tgcatttgca tggagaacc 19 

<210> 17 
<211> 20
<212> DNA 

<213> Lotus japonicus 
<400> 17 

tttgctgtga cattatcagc 20 

<210> 18 
<211> 20 
<212> DNA 

<213> Lotus japonicus 
<400> 18 

ttgcagattg cacaactagg 20 

<210> 19
<211> 21
<212> DNA

<213> Lotus japonicus
<400> 19

acttagaatc tgcaactttg c 21



<210> 20 
<211> 21 
<212> DNA 

<213> Lotus japonicus 
<400> 20 

acttagaatc tgcaactttg c 21 

<210> 21 
<211> 2205 
<212> DNA 

<213> Lotus japonicus 
<400> 21 

aagtgtgaca ttagtttcaa gagaaaaata aatgatcaaa acctggtaga gagtcctaga 60 

aattcaatgt tctgatttct ttcattcatc tctgctgcca ttttgatttg cacaatgaag 120 

ctaaaaactg gtctactttt gtttttcatt cttttgctgg ggcatgtttg tttccatgtg 180 

gaatcaaact gtctgaaggg gtgtgatcta gctttagctt cctattatat cttgcctggt 240 

gttttcatct tacaaaacat aacaaccttt atgcaatcag agattgtctc aagtaatgat 300 

gccataacca gctacaacaa agacaaaatt ctcaatgata tcaacatcca atcctttcaa 360 

agactcaaca ttccatttcc atgtgactgt attggtggtg agtttctagg gcatgtattt 420 

gagtactcag cttcaaaagg agacacttat gaaactattg ccaacctcta ctatgcaaat 480 

ttgacaacag ttgatctttt gaaaaggttc aacagctatg atccaaaaaa catacctgtt 540 

aatgccaagg ttaatgtcac tgttaattgt tcttgtggga acagccaggt ttcaaaagat 600 

tatggcttgt ttattaccta tcccattagg cctggggata cactgcagga tattgcaaac 660 

cagagtagtc ttgatgcagg gttgatacag agtttcaacc caagtgtcaa tttcagcaaa 720 

gatagtggga tagctttcat tcctggaaga tataaaaatg gagtctatgt tcccttgtac 780 

cacagaaccg caggtctagc tagtggtgca gctgttggta tatctattgc aggaaccttc 840 

gtgcttctgt tactagcatt ttgtatgtat gttagatacc agaagaagga agaagagaaa 900 

gctaaattgc caacagatat ttctatggcc ctttcaacac aagatgcctc tagtagtgca 960 

gaatatgaaa cttctggatc cagtgggcca gggactgcta gtgctacagg tcttactagc 1020 

attatggtgg cgaaatcaat ggagttctca tatcaggaac tagcgaaggc tacaaataac 1080 

tttagcttgg ataataaaat tggtcaaggt ggatttggag ctgtctatta tgcagaattg 1140 

agaggcaaga aaacagcaat taagaagatg gatgtacaag catcaacaga atttctttgt 1200 

gagttgaagg tcttaacaca tgttcaccac ttgaatctgg tgcgcttgat tggatactgc 1260 

gttgagggat ctctattcct tgtttatgaa catattgaca atggaaactt aggccaatat 1320 



ttgcatggtt caggtaaaga accattgcca tggtctagcc gagtacaaat agctctagat 1380

gcagcaagag gccttgaata cattcatgag cacactgtgc ctgtgtatat ccatcgcgat 1440

gtgaaatctg caaacatatt gatagataag aacttgcgtg gaaaggttgc agattttggc 1500

ttgaccaagc ttattgaagt tgggaactcc acactacaaa ctcgtctggt gggaacattt 1560

ggatacatgc ccccagaata tgctcaatat ggtgatattt ctccaaaaat agatgtatat 1620

gcatttggag ttgttctttt tgaacttatt tctgcaaaga atgctgttct gaagacaggt 1680

gaattagttg ctgaatcaaa gggccttgta gctttgtttg aagaagcact taataagagt 1740

gatccttgtg atgctcttcg caaactggtg gatcctaggc ttggagaaaa ctatccaatt 1800

gattctgttc tcaagattgc acaactaggg agagcttgta caagagataa tccactgcta 1860

agaccaagta tgagatcttt agttgttgct cttatgaccc tttcatcact tactgaggat 1920

tgtgatgatg aatcttccta cgaaagtcaa actctcataa atttactgtc tgtgagataa 1980

aggttctcca tgcaaatgca tgtttgttat atatatcttg tagtacaact aagcagacaa 2040

aaagttttgt actttgaatg taaatcgagt cagggtgttt acattttatt actccaatgt 2100

ttaattgcca aaaccatcaa aaagtcctag gccagacttc ctgtaattat atttagcaaa 2160

gttgcagatt ctaagttcag tttttttaaa aaaaaaaaaa aaaaa 2205



<210> 22 

<211> 2210 

<212> DNA 

<213> Lotus japonicus 

<400> 22 



aagtgtgaca ttagtttcaa gagaaaaata aatgatcaaa acctggtaga gagtcctaga 60

aattcaatgt tctgatttct ttcattcatc tctgctgcca ttttgatttg cacaatgaag 120

ctaaaaactg gtctactttt gtttttcatt cttttgctgg ggcatgtttg tttccatgtg 180

gaatcaaact gtctgaaggg gtgtgatcta gctttagctt cctattatat cttgcctggt 240

gttttcatct tacaaaacat aacaaccttt atgcaatcag agattgtctc aagtaatgat 300

gccataacca gctacaacaa agacaaaatt ctcaatgata tcaacatcca atcctttcaa 360

agactcaaca ttccatttcc atgtgactgt attggtggtg agtttctagg gcatgtattt 420

gagtactcag cttcaaaagg agacacttat gaaactattg ccaacctcta ctatgcaaat 480

ttgacaacag ttgatctttt gaaaaggttc aacagctatg atccaaaaaa catacctgtt 540

aatgccaagg ttaatgtcac tgttaattgt tcttgtggga acagccaggt ttcaaaagat 600

tatggcttgt ttattaccta tcccattagg cctggggata cactgcagga tattgcaaac 660



cagagtagtc ttgatgcagg gttgatacag agtttcaacc caagtgtcaa tttcagcaaa 720 

gatagtggga tagctttcat tcctggaaga tataaaaatg gagtctatgt tcccttgtac 780 

cacagaaccg caggtctagc tagtggtgca gctgttggta tatctattgc aggaaccttc 840 

gtgcttctgt tactagcatt ttgtatgtat gttagatacc agaagaagga agaagagaaa 900 

gctaaattgc caacagatat ttctatggcc ctttcaacac aagatggtaa tgcctctagt 960 

agtgcagaat atgaaacttc tggatccagt gggccaggga ctgctagtgc tacaggtctt 1020 

actagcatta tggtggcgaa atcaatggag ttctcatatc aggaactagc gaaggctaca 1080 

aataacttta gcttggataa taaaattggt caaggtggat ttggagctgt ctattatgca 1140 

gaattgagag gcaagaaaac agcaattaag aagatggatg tacaagcatc aacagaattt 1200 

ctttgtgagt tgaaggtctt aacacatgtt caccacttga atctggtgcg cttgattgga 1260 

tactgcgttg agggatctct attccttgtt tatgaacata ttgacaatgg aaacttaggc 1320 

caatatttgc atggttcagg taaagaacca ttgccatggt ctagccgagt acaaatagct 1380 

ctagatgcag caagaggcct tgaatacatt catgagcaca ctgtgcctgt gtatatccat 1440 

cgcgatgtga aatctgcaaa catattgata gataagaact tgcgtggaaa ggttgcagat 1500 

tttggcttga ccaagcttat tgaagttggg aactccacac tacaaactcg tctggtggga 1560 

acatttggat acatgccccc agaatatgct caatatggtg atatttctcc aaaaatagat 1620 

gtatatgcat ttggagttgt tctttttgaa cttatttctg caaagaatgc tgttctgaag 1680 

acaggtgaat tagttgctga atcaaagggc cttgtagctt tgtttgaaga agcacttaat 1740 

aagagtgatc cttgtgatgc tcttcgcaaa ctggtggatc ctaggcttgg agaaaactat 1800 

ccaattgatt ctgttctcaa gattgcacaa ctagggagag cttgtacaag agataatcca 1860 

ctgctaagac caagtatgag atctttagtt gttgctctta tgaccctttc atcacttact 1920 

gaggattgtg atgatgaatc ttcctacgaa agtcaaactc tcataaattt actgtctgtg 1980 

agataaaggt tctccatgca aatgcatgtt tgttatatat atcttgtagt acaactaagc 2040 

agacaaaaag ttttgtactt tgaatgtaaa tcgagtcagg gtgtttacat tttattactc 2100 

caatgtttaa ttgccaaaac catcaaaaag tcctaggcca gacttcctgt aattacattt 2160 
agcaaagttg cagattctaa gttcagtttt tttaaaaaaa aaaaaaaaaa 2210 

<210> 23 
<211> 10253 
<212> DNA 

<213> Lotus japonicus Gifu 
<220>
<221> exon
<222> (4172)..(4808)
<223>

<220>
<221> Intron
<222> (4809)..(5280)
<223>

<220>
<221> exon
<222> (5281)..(5314)
<223>

<220>
<221> Intron
<222> (5315)..(5561)
<223>

<220>
<221> exon
<222> (5562)..(5569)
<223>

<220>
<221> Intron
<222> (5570)..(5685)
<223>

<220>
<221> exon
<222> (5686)..(5838)
<223>

<220>
<221> Intron
<222> (5839)..(6475)
<223>

<220>
<221> exon
<222> (6476)..(6678)
<223>

<220>
<221> Intron
<222> (6679)..(7105)
<223>

<220>
<221> exon
<222> (7106)..(7195)
<223>

<220>
<221> Intron
<222> (7196)..(7933)
<223>

<220>
<221> exon
<222> (7934)..(8027)
<223>

<220>
<221> Intron
<222> (8028)..(8232)
<223>

<220>
<221> exon
<222> (8233)..(8384)
<223>

<220>
<221> Intron
<222> (8385)..(8471)
<223>

<220>
<221> exon
<222> (8472)..(8563)
<223>

<220>
<221> Intron
<222> (8564)..(9137)
<223>

<220>
<221> exon
<222> (9138)..(9275)
<223>

<220>
<221> Intron
<222> (9276)..(9403)
<223>

<220>
<221> exon
<222> (9404)..(9502)
<223>

<220>
<221> Intron
<222> (9503)..(9694)
<223>

<220>
<221> exon
<222> (9695)..(9859)
<223>

<400> 23 



gcatgcatat agctctattt ctttagtaat gttacacctg cacgatgtgc ataataatag 60
aagacataat acatatacag attaaaatta aataaacaat ttctaatcaa atttaaaaat 120
gtcaacttaa tttcattatt aaaatataac aatatgaata accaaaaata aattaagaca 180
ttcacccccc cccccccgaa aagaaattta agacaattac aattttttgg tatatatatt 240
aaagacttcc aattatggac ataggatctc aacttagtaa tcttcacttt aggaaagtct 300
tttccccaca agtcacaacc atctattaat atcaatacaa aatgaagaca actcaataaa 360
aagatccttt tataggaaat tgacgaataa aactgatata tatttcagtt aaaattgttc 420
aaacattagt gcaatggaca gaagtatcct ttgtgccctc atttgccaac aactggctca 480
tcaagcaata aattaattcg ccatttccaa acttttgcag ttttaagtag aagatatcca 540
ttcgttgaaa ctttcttcac accaccaatt tcctcctaaa tgggttaaca aatgtgcaat 600
gaccgaaata tatagttgaa acgatcaaga tcctctcaat ggtaaaagaa tttgaccacg 660
ctaagtttta ttatctcact agctattaat ttaattatca tttatctttt caattattaa 720
acacacaaat aatcaatcct aaaatgatga aacttgacat ggtctatttt tacaataact 780
taaccaaaaa cttataagtt agcaactttc aaaaacaggt tttcccttgt taagaataga 840
caaatcaaat ggagtgtgtt aaatattgtg tttaaaatag tgtgttgcaa gcatttctct 900
tataaaaaat cagtataaat atgtttggaa ctgtttattt tagtttatct tatattataa 960
atacaaaaca agtgtttggt aaagctaatg aaaataactt aaaacatacc tatggtttgt 1020
cggatgtgtc gcgggtggag ctccttgcca ttttgtgtgg cctttgtatg tgttgagata 1080
tggccatggg ttattgtaga atccttcttc ttttggattc ttctttgttc ctgtctctca 1140
ttcaacgctc atgttaccca tttcatccat gccaagtttt ttttatacat gcatctaaat 1200



tttttccgcc atatcttaat tttgttttta ttaaaattaa aaagaatatg attgaatgtc 1260 
aatgtaaatt ttttttacac agacaatgca tatccattaa ggtttgttag aattacactc 1320 
caccccattt ttatctaaaa tctacatccc accccatttt atatagaggc aaatttagtg 1380 
acgaaaaata ttcttcatta ataattagtt attatttaaa ctgttaatca ataatttcaa 1440 
aaaaaaattc aatataatcc aataaattta aaaatgaaaa catcacaatc ctcctcattc 1500 
tctcaatcgc gttttacctc cgtaaattta caatgcaaat tatgcaatag cacacctgcc 1560 
cgatttacaa accatatttc gaacatagtg aaacatgctt gtgttttcat atttggtgat 1620 
aattcaattt taatcaaaat aatctcttta tacctccaat tttcaaaatt gggttgtagg 1680 
ccaaaaaagc aacacaaatg ggtgaagaaa atagagaaac aaaattatga aaatatgaag 1740
tggatctgag gttattagag cccaacgagg cggtggagca tcgtttttaa caaaatccaa 1800
caatatcttt aggggtgaaa tccaccaacc gagcgttcgc tcacaactta aaggggtgaa 1860 
attcacaaga agtagtttaa aggggtgaaa ttcacaagaa gtagttgaac aagtgacttt 1920 
aggaatgtgc gattcacgtt ctggggttca ggtcgcaaca aaactttgag gcatggtggt 1980 
gccatgtggt ttcacattgt gggacagtgg agccgtgtta aagggagtaa aggcttggtg 2040 
gtggccgttt gtggtgaaaa atatgatttg gagatagatg ggacgtggac ttaacagaca 2100 
gagtggatgg ttttttttta aatttaattc agtaaaatta tttttataaa ttaatgtatg 2160 
atagtgtata tgcattaaat ttatttaaat ttttactaat tagtaatttg tttttagtag 2220 
tgacgaattt gtttttgtca ctaaattttg cctttataaa aaatggggtg gagtgtagat 2280 
ttttaaaaag aatggggtgg agtgtcattc ttgcaaatct tgaggggggt gagtgtattt 2340 
tactcaactc ttaaaaaaat taggaattaa ttagttgtaa attataaaag tttatttcat 2400 
tgaataacat aacaaattaa aggcaaaaaa atacaaaact tcattttata tgtatttcag 2460 
aaaaattgcc tactttcaat tatgagaaac taaaattatg tttagtttaa aatgagcata 2520 
gattcaaaaa ttaataaata atatatatag caggatacat gcctatcaat taacatatcg 2580 
tttgtccacg atgatgatct tattggagga tcaatatctt caaattaaca aagttatcac 2640 
ttggctctta ttggtcataa tgcaataaaa aaattgcaat tagtatcaaa tcaaactgaa 2700 
atttgcaact atatgctgct ggtgttgtcg cgcagattcc tttttgattt ttatgggaat 2760 
gaagtcaatg aagcaacagt ttcacaggcg tgcttaaaaa taaaaaaatt ggaaatttga 2820 
tgtttgttag gattatgaga ggacacaatg ggaggatgtt tcacaagctg cagacagggt 2880 
tgccacttca gatgcaaagg attaaataaa caaagccaag gtttgcaatc aacaagattc 2940 



catcgtcgtt ttgcttcctt taatcgtatt aatcaaaagc acaccaagta aagcatcaat 3000 

atataacatc caagaaatca caacatgata gttgcccgtc tcgtctatta actatgatgt 3060 

caggagttcg atccccgctc atgtgaatgg aagacatttc gttgttagat gtttaccgtt 3120 

taatgcaaat actcgcggcg agatddtdag tcattgttgt gggcgaatac cctaaaataa 3180 

gaataaaatt aaatatagca tccaagttat tgcccaaata tataaacaat ggtattgttg 3240 

acattattag gcataaaagc agtaggtaag tgtattatat ttatttaatt ttttaaaatt 3300 

ttgaaattaa ttaataattg ttaacataag taaaccattt ttagcaaaaa ctctacactt 3360 

ctattacctt aacaagtaca tttttgatgg tacaccttaa caattaacaa gtcatatgat 3420 

tgacaaacat attttatatg ctttacaatt tattctaaaa tcaaagttta tgggaagaag 3480 

ctcataaaag tagttcctgg gtgtttttta gaatagagaa gttgatcatg ttagaaatta 3540 

agttaaaaat gagttgaaag tgatttatgt ttgattatat ttatgagaaa aatgaattgt 3600 

ctgatgtaat attgtaaaat ctaacaatta attaagtacc acagaaacta gaatttatag 3660 

cttcaccfcta gaattgattt tggagttaaa atcaattatt aaaggagcaa ttattaaagg 3720 

agacatccaa atacactagt taattttgac aatcaattct aacacttgca aatgtgtaac 3780 

caaacttact atcagtaagt gaactaatga ttcccaagtc aacttttgtt ctagctagcc 3840 

aaccgttact atgttccctc cacaatacat tctccttgaa actgtcaagt gtcaactgca 3900 

cccaaacatc cttgtttgtg atgaaaagat cgaaaacgtg tgcttatgaa tttdcatgtt 3960 

tacattcacc aaaaatcaaa agttacacct ctatacttat cacatatgtt tgagtcactt 4020 

tccatataaa atcccatagt ctattaatta tcagagtaag tgtgacatta gtttcaagag 4080 

aaaaataaat gatcaaaacc tggtagagag tcctagaaat tcaatgttct gatttctttc 4140 

attcatctct gctgccattt tgatttgcac a atg aag cta aaa act ggt cta 4192

Met Lys Leu Lys Thr Gly Leu
1 5

ctt ttg ttt ttc att ctt ttg ctg ggg cat gtt tgt ttc cat gtg gaa 4240
Leu Leu Phe Phe Ile Leu Leu Leu Gly His Val Cys Phe His Val Glu
10 15 20

tca aac tgt ctg aag ggg tgt gat cta gct tta gct tcc tat tat atc 4288
Ser Asn Cys Leu Lys Gly Cys Asp Leu Ala Leu Ala Ser Tyr Tyr Ile
25 30 35

ttg cct ggt gtt ttc atc tta caa aac ata aca acc ttt atg caa tca 4336
Leu Pro Gly Val Phe Ile Leu Gln Asn Ile Thr Thr Phe Met Gln Ser
40 45 50 55

gag att gtc tca agt aat gat gcc ata acc agc tac aac aaa gac aaa 4384
Glu Ile Val Ser Ser Asn Asp Ala Ile Thr Ser Tyr Asn Lys Asp Lys
60 65 70

att ctc aat gat atc aac atc caa tcc ttt caa aga ctc aac att cca 4432
Ile Leu Asn Asp Ile Asn Ile Gln Ser Phe Gln Arg Leu Asn Ile Pro
75 80 85

ttt cca tgt gac tgt att ggt ggt gag ttt cta ggg cat gta ttt gag 4480
Phe Pro Cys Asp Cys Ile Gly Gly Glu Phe Leu Gly His Val Phe Glu
90 95 100

tac tca gct tca aaa gga gac act tat gaa act att gcc aac ctc tac 4528
Tyr Ser Ala Ser Lys Gly Asp Thr Tyr Glu Thr Ile Ala Asn Leu Tyr
105 110 115

tat gca aat ttg aca aca gtt gat ctt ttg aaa agg ttc aac agc tat 4576
Tyr Ala Asn Leu Thr Thr Val Asp Leu Leu Lys Arg Phe Asn Ser Tyr
120 125 130 135

gat cca aaa aac ata cct gtt aat gcc aag gtt aat gtc act gtt aat 4624
Asp Pro Lys Asn Ile Pro Val Asn Ala Lys Val Asn Val Thr Val Asn
140 145 150

tgt tct tgt ggg aac agc cag gtt tca aaa gat tat ggc ttg ttt att 4672
Cys Ser Cys Gly Asn Ser Gln Val Ser Lys Asp Tyr Gly Leu Phe Ile
155 160 165

acc tat ccc att agg cct ggg gat aca ctg cag gat att gca aac cag 4720
Thr Tyr Pro Ile Arg Pro Gly Asp Thr Leu Gln Asp Ile Ala Asn Gln
170 175 180

agt agt ctt gat gca ggg ttg ata cag agt ttc aac cca agt gtc aat 4768
Ser Ser Leu Asp Ala Gly Leu Ile Gln Ser Phe Asn Pro Ser Val Asn
185 190 195

ttc agc aaa gat agt ggg ata gct ttc att cct gga aga t gtatgttatc 4818
Phe Ser Lys Asp Ser Gly Ile Ala Phe Ile Pro Gly Arg
200 205 210



ctttttgttt taaatttttc cgctttgatt aaagtttatt attattagca tgattggatc 4878

aacttctctt tcatcaaaat catttctgaa actcagaagc tactcacaca agcttcctgg 4938

tttcagaatc aattgtagta gggtttccaa acatgctctt ttatcaaaat caattacgta 4998

actcagaaac tactcacata agcttctcct tagaattgat tctgttttta gaatcaattg 5058

taaaagggtt tacaaacatg cactctgcta gtgtgtgtgc ttaaaactat tcatggtgaa 5118

attactcttc cattgtttct acaataatac atgacaaggc atgtaactta ccccacctaa 5178

ttgaaaaatg gttggtggtt attgttatat catttgttca atacatttga tataaacttt 5238

tatgaattta cctgaagttt tacttttctt tgaacttttc ag at aaa aat gga 5291
Tyr Lys Asn Gly
215

gtc tat gtt ccc ttg tac cac ag gtgggtaact tcaattgcct actcatcttt 5344 
Val Tyr Val Pro Leu Tyr His Arg 
220 



ttatgatgaa tgatagcatg tttggatcaa cttctctttc accagaatta atccttaaat 5404 

tcagaactaa gaagctactc acataagctt tttcccggaa ttaattctgg cttcagaagc 5464 

aattacactg aaagatttcc aaacatgctc taaatattgt ttcgtgcttg gttctatctt 5524 

tttaactttc atttattttt cctttttcat tttgcag a acc gca g gtttggccct 5579 

Thr Ala 
225 

ctaaattggt tctagggatg attattttta ccttgatgtt cacaaaaata tgagaacaca 5639 

aaaaaagagg atgcctctga gcttagcttt acttctatgt aagcag gt cta gct 5693

Gly Leu Ala

agt ggt gca gct gtt ggt ata tct att gca gga acc ttc gtg ctt ctg 5741
Ser Gly Ala Ala Val Gly Ile Ser Ile Ala Gly Thr Phe Val Leu Leu
230 235 240 245

tta cta gca ttt tgt atg tat gtt aga tac cag aag aag gaa gaa gag 5789
Leu Leu Ala Phe Cys Met Tyr Val Arg Tyr Gln Lys Lys Glu Glu Glu
250 255 260

aaa gct aaa ttg cca aca gat att tct atg gcc ctt tca aca caa gat g 5838
Lys Ala Lys Leu Pro Thr Asp Ile Ser Met Ala Leu Ser Thr Gln Asp
265 270 275

gtaatggtat atttccaaat tcatattcct tctaagttct aaccctcttt agtccccctg 5898 

gaaatgggtg aatgttggtg ctctaatttt tcatgtgttt aaatcagttt tatactaaga 5958 

gtctgttgga caacaggttt ttgtttttaa aacagaaaaa gccgaaaatt tgtttgatat 6018 

gaaaagtttt aaggaaattc ttattttttt gatatatcgg aaaattctta ttaagtgttc 6078 

ctgttctcat tttctaaaac taaaatttca aaacatctcg gaggattttt cttcttgttt 6138 

ttagttttca attcacaggt ctttcagttt tgtaagcatc ttgttcaaat atagattttc 6198 

ttttcttctt ttgaaaaaca tgtcataaaa ttatttctga aaatagtttt taaatttaga 6258

ggactgagaa gagaatcaaa caagtcctaa tttttacctt ttcctgttta tcatttataa 6318

acttattacc tgatctaatt tcaggctaca ttttacctga tgttaaaggc agaaaattta 6378 

cctgatccaa atgtttgagt tccattcaat ctggcacatt gatataattt gagaggatat 6438 

gacaacacta gctaactttt cttcctcttt cttgaag cc tct agt agt gca gaa 6492

Ala Ser Ser Ser Ala Glu 
280 

tat gaa act tct gga tcc agt ggg cca ggg act gct agt gct aca ggt 6540
Tyr Glu Thr Ser Gly Ser Ser Gly Pro Gly Thr Ala Ser Ala Thr Gly
285 290 295

ctt act agc att atg gtg gcg aaa tca atg gag ttc tca tat cag gaa 6588
Leu Thr Ser Ile Met Val Ala Lys Ser Met Glu Phe Ser Tyr Gln Glu
300 305 310 315

cta gcg aag gct aca aat aac ttt agc ttg gat aat aaa att ggt caa 6636
Leu Ala Lys Ala Thr Asn Asn Phe Ser Leu Asp Asn Lys Ile Gly Gln
320 325 330

ggt gga ttt gga gct gtc tat tat gca gaa ttg aga ggc aag 6678
Gly Gly Phe Gly Ala Val Tyr Tyr Ala Glu Leu Arg Gly Lys
335 340 345

gtagtgaccg tgtgtctctt cagttctata acatagtgca tgtttggata caaagaggaa 6738 

aaccacggtg aagccaaatt tgcggtggac agacacaaaa gctaaaggaa gttgtcacca 6798 

tgattttcaa ttgtgtatcc aaacttgcac aaaagaggat agaagtttct tacattagag 6858

tagtagtgaa aagtttaaat tttaaggctt tgtgttcatt gtgaggaagc tatataaaac 6918

aactcaaatc agtttagggc aaaaaattgt ttcattgaaa agaaagataa gagtaatgat 6978

tttacttaaa tggatattgt tcttaaagag gtggatggga aagtttctgc tttttgtgcc 7038

actttaggtt atccctttaa cttttaactc ttcctggatt tcctctaatg caatttattc 7098

aatgcag aaa aca gca att aag aag atg gat gta caa gca tca aca gaa 7147
Lys Thr Ala Ile Lys Lys Met Asp Val Gln Ala Ser Thr Glu
350 355

ttt ctt tgt gag ttg aag gtc tta aca cat gtt cac cac ttg aat ctg 7195
Phe Leu Cys Glu Leu Lys Val Leu Thr His Val His His Leu Asn Leu
360 365 370 375

gtacaacatc cttcaaacaa cttaaagcat tattatatct ttgggaagga aagattaata 7255 

tttttatgtt tagtttgaag aatcattagg ttcttacaaa acaaatatcc ttcatggttc 7315

tgtgaactga atagtcctat agttatccag caaaatttct gcagatccac atgatagtcc 7375

aacatgggat ctgcattact agtgaaagaa cttgtaaaac atttgtaact tcaattttct 7435 

gtccttgaaa gtaacagacc atttagagca cactccccaa cattaatacc aaataaagaa 7495 

gaaaatcagc cctcttcccg catgtgtggt tccactgtga aatatttgaa aatcacttgt 7555 

gattagaagc tacaagtcta agcttctgag caaacgtgtc ttggattttg tgctaatcat 7615 

aaagccaaat atgctattag ttaatgatta aaggcattat tagaaactcc tttatttcca 7675 

attgccactg ttgatatgtt atttggattt ttcaaacagt ttctcctaac aaacaggttc 7735

agaaaaaaaa ttagtattaa tttctatcta tgattactta aagaagaaag tgctaaattc 7795 

tttctgggat ttcaatataa ctatatcata cacttttcat ttaatttttc taattttgga 7855 

atctttgttt agcataaaca gctctaagta agttataatt cttattctgt atgtacctac 7915 

tttctatgaa caacatag gtg cgc ttg att gga tac tgc gtt gag gga tct 7966
Val Arg Leu Ile Gly Tyr Cys Val Glu Gly Ser
380 385

cta ttc ctt gtt tat gaa cat att gac aat gga aac tta ggc caa tat 8014
Leu Phe Leu Val Tyr Glu His Ile Asp Asn Gly Asn Leu Gly Gln Tyr
390 395 400

ttg cat ggt tca g gtgagaacag gatgcagtga tatttttttg ctgtgacatt 8067
Leu His Gly Ser
405

atcagcatgt ttggatcaat ttctctttca ccagaattaa ttctgaaaca gagaagtagc 8127 

ttctccacag aattgattct gacttcagag tcaatagtag aattatttcg aaacatgcac 8187 

ggcattatag tcaaacaatt aataatgatg atgacatgat ttcag gt aaa gaa cca 8243 

Gly Lys Glu Pro 
410 

ttg cca tgg tct agc cga gta caa ata gct cta gat gca gca aga ggc 8291
Leu Pro Trp Ser Ser Arg Val Gln Ile Ala Leu Asp Ala Ala Arg Gly
415 420 425

ctt gaa tac att cat gag cac act gtg cct gtg tat atc cat cgc gat 8339
Leu Glu Tyr Ile His Glu His Thr Val Pro Val Tyr Ile His Arg Asp
430 435 440

gtg aaa tct gca aac ata ttg ata gat aag aac ttg cgt gga aag 8384
Val Lys Ser Ala Asn Ile Leu Ile Asp Lys Asn Leu Arg Gly Lys
445 450 455

gttgcattta ttaccaatct tcatgatcca aattctttca tttcttcttt gagactttaa 8444

tcaaactgtg aaagttttta tgttcag gtt gca gat ttt ggc ttg acc aag ctt 8498

Val Ala Asp Phe Gly Leu Thr Lys Leu
460 465

att gaa gtt ggg aac tcc aca cta caa act cgt ctg gtg gga aca ttt 8546
Ile Glu Val Gly Asn Ser Thr Leu Gln Thr Arg Leu Val Gly Thr Phe
470 475 480

gga tac atg ccc cca ga gtatgatttt cttttgatgt tgtattaatg 8593
Gly Tyr Met Pro Pro Asp
485

gtgtttttgg ataaacagtt taatcaaaag ttgatggtaa taaacaccta tcgcataagt 8653 

gtttattcat aaactatttt gagatgttta ttgagataaa gttaaaatat ctaatgagtt 8713

tagtgactta tgaaagtaag ctctcaacaa cttataagta gggtataagg tatttacaat 8773 

acataagctc taacaagcac ttagatacac acatttgagc ttatctttca caataaatgc 8833 

tcgtacaagt gtttgagaga gcttgtgtag cttatgcgct acctagaagc tgatttgagc 8893 

ttattttcac aagttgttca tattagctta tgaataagag attatgctta tatataattt 8953 

attttcagct tatttcaata agttcatcaa atttgcttat gaataagtgc ttgtgcgaca 9013 

agcgcttatt gctacaagtg cttaattacg ctgtttaccc ataaacgtgt tcaattagta 9073 

aagtcaagtt cagttttcaa aacatatcat tgagtgaact tgttttacct ggcttttatg 9133 



caga t atg ctc aat atg gtg ata ttt ctc caa aaa tag atg tat atg 9180
Met Leu Asn Met Val Ile Phe Leu Gln Lys Met Tyr Met
490 495 500

cat ttg gag ttg ttc ttt ttg aac tta ttt ctg caa aga atg ctg ttc 9228
His Leu Glu Leu Phe Phe Leu Asn Leu Phe Leu Gln Arg Met Leu Phe
505 510 515

tga aga cag gtg aat tag ttg ctg aat caa agg gcc ttg tag ctt tg 9275
Arg Gln Val Asn Leu Leu Asn Gln Arg Ala Leu Leu Cys
520 525 

gtgagtctac atgccccttc tctaacctta tttacaaacc aattactcac aatttcgaaa 9335 

attttacatg tatatttcaa agctactcag cacaaatgca tttgccctta acttgctttg 9395 

cattgcag t ttg aag aag cac tta ata aga gtg atc ctt gtg atg ctc 9443
Leu Lys Lys His Leu Ile Arg Val Ile Leu Val Met Leu
535 540

ttc gca aac tgg tgg atc cta ggc ttg gag aaa act atc caa ttg att 9491
Phe Ala Asn Trp Trp Ile Leu Gly Leu Glu Lys Thr Ile Gln Leu Ile
545 550 555

ctg ttc tca ag gtgggagcaa ttctcactaa aattaatttg aaatgaatta 9542

Leu Phe Ser Arg 

560 

ctatcattta gtcacttgaa tgactttttt tatcagaaca taagcaggtt gtgtctagtt 9602 

ttcttttggt gggtttagga cttaaagtta tcttagtgta aaattttctc attttactaa 9662

accttaatgc tttattgttg tttgagttgc ag a ttg cac aac tag gga gag ctt 9716 

Leu His Asn Gly Glu Leu 
565 

gta caa gag ata atc cac tgc taa gac caa gta tga gat ctt tag ttg 9764
Val Gln Glu Ile Ile His Cys Asp Gln Val Asp Leu Leu
570 575 580

ttg ctc tta tga ccc ttt cat cac tta ctg agg att gtg atg atg aat 9812
Leu Leu Leu Pro Phe His His Leu Leu Arg Ile Val Met Met Asn
585 590 595

ctt cct acg aaa gtc aaa ctc tca taa att tac tgt ctg tga gat aa 9859
Leu Pro Thr Lys Val Lys Leu Ser Ile Tyr Cys Leu Asp
600 605 610 

aggttctcca tgcaaatgca tgtttgttat atatatcttg tagtacaact aagcagacaa 9919

aaagttttgt actttgaatg taaatcgagt cagggtgttt acattttatt actccaatgt 9979

ttaattgcca aaaccatcaa aaagtcctag gccagacttc ctgtaattat atttagcaaa 10039

gttgcagatt ctaagttcag tttttttata tataggtttc agtatttttt atatatatta 10099

ttttataaat tttttaactt gttacaatat aaacatattt gcattcatct tcaaatcttt 10159



cagaatcact tctcctacca cagaagctaa tagaagtgtc ttccagaatc aattcttcat 10219 
ccactgtgaa aatctactat gtatcaaagc atgc 10253 



<210> 24 
<211> 621 
<212> PRT 

<213> Lotus japonicus Gifu 
<400> 24 

Met Lys Leu Lys Thr Gly Leu Leu Leu Phe Phe Ile Leu Leu Leu Gly
1 5 10 15

His Val Cys Phe His Val Glu Ser Asn Cys Leu Lys Gly Cys Asp Leu
20 25 30

Ala Leu Ala Ser Tyr Tyr Ile Leu Pro Gly Val Phe Ile Leu Gln Asn
35 40 45

Ile Thr Thr Phe Met Gln Ser Glu Ile Val Ser Ser Asn Asp Ala Ile
50 55 60

Thr Ser Tyr Asn Lys Asp Lys Ile Leu Asn Asp Ile Asn Ile Gln Ser
65 70 75 80

Phe Gln Arg Leu Asn Ile Pro Phe Pro Cys Asp Cys Ile Gly Gly Glu
85 90 95

Phe Leu Gly His Val Phe Glu Tyr Ser Ala Ser Lys Gly Asp Thr Tyr
100 105 110

Glu Thr Ile Ala Asn Leu Tyr Tyr Ala Asn Leu Thr Thr Val Asp Leu
115 120 125

Leu Lys Arg Phe Asn Ser Tyr Asp Pro Lys Asn Ile Pro Val Asn Ala
130 135 140

Lys Val Asn Val Thr Val Asn Cys Ser Cys Gly Asn Ser Gln Val Ser
145 150 155 160

Lys Asp Tyr Gly Leu Phe Ile Thr Tyr Pro Ile Arg Pro Gly Asp Thr
165 170 175

Leu Gln Asp Ile Ala Asn Gln Ser Ser Leu Asp Ala Gly Leu Ile Gln
180 185 190

Ser Phe Asn Pro Ser Val Asn Phe Ser Lys Asp Ser Gly Ile Ala Phe
195 200 205

Ile Pro Gly Arg Tyr Lys Asn Gly Val Tyr Val Pro Leu Tyr His Arg
210 215 220

Thr Ala Gly Leu Ala Ser Gly Ala Ala Val Gly Ile Ser Ile Ala Gly
225 230 235 240

Thr Phe Val Leu Leu Leu Leu Ala Phe Cys Met Tyr Val Arg Tyr Gln
245 250 255

Lys Lys Glu Glu Glu Lys Ala Lys Leu Pro Thr Asp Ile Ser Met Ala
260 265 270

Leu Ser Thr Gln Asp Ala Ser Ser Ser Ala Glu Tyr Glu Thr Ser Gly
275 280 285

Ser Ser Gly Pro Gly Thr Ala Ser Ala Thr Gly Leu Thr Ser Ile Met
290 295 300

Val Ala Lys Ser Met Glu Phe Ser Tyr Gln Glu Leu Ala Lys Ala Thr
305 310 315 320

Asn Asn Phe Ser Leu Asp Asn Lys Ile Gly Gln Gly Gly Phe Gly Ala
325 330 335

Val Tyr Tyr Ala Glu Leu Arg Gly Lys Lys Thr Ala Ile Lys Lys Met
340 345 350

Asp Val Gln Ala Ser Thr Glu Phe Leu Cys Glu Leu Lys Val Leu Thr
355 360 365

His Val His His Leu Asn Leu Val Arg Leu Ile Gly Tyr Cys Val Glu
370 375 380

Gly Ser Leu Phe Leu Val Tyr Glu His Ile Asp Asn Gly Asn Leu Gly
385 390 395 400

Gln Tyr Leu His Gly Ser Gly Lys Glu Pro Leu Pro Trp Ser Ser Arg
405 410 415

Val Gln Ile Ala Leu Asp Ala Ala Arg Gly Leu Glu Tyr Ile His Glu
420 425 430

His Thr Val Pro Val Tyr Ile His Arg Asp Val Lys Ser Ala Asn Ile
435 440 445

Leu Ile Asp Lys Asn Leu Arg Gly Lys Val Ala Asp Phe Gly Leu Thr
450 455 460

Lys Leu Ile Glu Val Gly Asn Ser Thr Leu Gln Thr Arg Leu Val Gly
465 470 475 480

Thr Phe Gly Tyr Met Pro Pro Glu Tyr Ala Gln Tyr Gly Asp Ile Ser
485 490 495

Pro Lys Ile Asp Val Tyr Ala Phe Gly Val Val Leu Phe Glu Leu Ile
500 505 510

Ser Ala Lys Asn Ala Val Leu Lys Thr Gly Glu Leu Val Ala Glu Ser
515 520 525

Lys Gly Leu Val Ala Leu Phe Glu Glu Ala Leu Asn Lys Ser Asp Pro
530 535 540

Cys Asp Ala Leu Arg Lys Leu Val Asp Pro Arg Leu Gly Glu Asn Tyr
545 550 555 560

Pro Ile Asp Ser Val Leu Lys Ile Ala Gln Leu Gly Arg Ala Cys Thr
565 570 575

Arg Asp Asn Pro Leu Leu Arg Pro Ser Met Arg Ser Leu Val Val Ala
580 585 590

Leu Met Thr Leu Ser Ser Leu Thr Glu Asp Cys Asp Asp Glu Ser Ser
595 600 605

Tyr Glu Ser Gln Thr Leu Ile Asn Leu Leu Ser Val Arg
610 615 620



<210> 25 

<211> 623 

<212> PRT 

<213> Lotus japonicus Gifu 

<400> 25 



Met Lys Leu Lys Thr Gly Leu Leu Leu Phe Phe Ile Leu Leu Leu Gly
1 5 10 15

His Val Cys Phe His Val Glu Ser Asn Cys Leu Lys Gly Cys Asp Leu
20 25 30

Ala Leu Ala Ser Tyr Tyr Ile Leu Pro Gly Val Phe Ile Leu Gln Asn
35 40 45

Ile Thr Thr Phe Met Gln Ser Glu Ile Val Ser Ser Asn Asp Ala Ile
50 55 60

Thr Ser Tyr Asn Lys Asp Lys Ile Leu Asn Asp Ile Asn Ile Gln Ser
65 70 75 80

Phe Gln Arg Leu Asn Ile Pro Phe Pro Cys Asp Cys Ile Gly Gly Glu
85 90 95

Phe Leu Gly His Val Phe Glu Tyr Ser Ala Ser Lys Gly Asp Thr Tyr
100 105 110

Glu Thr Ile Ala Asn Leu Tyr Tyr Ala Asn Leu Thr Thr Val Asp Leu
115 120 125

Leu Lys Arg Phe Asn Ser Tyr Asp Pro Lys Asn Ile Pro Val Asn Ala
130 135 140

Lys Val Asn Val Thr Val Asn Cys Ser Cys Gly Asn Ser Gln Val Ser
145 150 155 160

Lys Asp Tyr Gly Leu Phe Ile Thr Tyr Pro Ile Arg Pro Gly Asp Thr
165 170 175

Leu Gln Asp Ile Ala Asn Gln Ser Ser Leu Asp Ala Gly Leu Ile Gln
180 185 190

Ser Phe Asn Pro Ser Val Asn Phe Ser Lys Asp Ser Gly Ile Ala Phe
195 200 205

Ile Pro Gly Arg Tyr Lys Asn Gly Val Tyr Val Pro Leu Tyr His Arg
210 215 220

Thr Ala Gly Leu Ala Ser Gly Ala Ala Val Gly Ile Ser Ile Ala Gly
225 230 235 240

Thr Phe Val Leu Leu Leu Leu Ala Phe Cys Met Tyr Val Arg Tyr Gln
245 250 255

Lys Lys Glu Glu Glu Lys Ala Lys Leu Pro Thr Asp Ile Ser Met Ala
260 265 270

Leu Ser Thr Gln Asp Ala Gly Asn Ser Ser Ser Ala Glu Tyr Glu Thr
275 280 285

Ser Gly Ser Ser Gly Pro Gly Thr Ala Ser Ala Thr Gly Leu Thr Ser
290 295 300

Ile Met Val Ala Lys Ser Met Glu Phe Ser Tyr Gln Glu Leu Ala Lys
305 310 315 320

Ala Thr Asn Asn Phe Ser Leu Asp Asn Lys Ile Gly Gln Gly Gly Phe
325 330 335

Gly Ala Val Tyr Tyr Ala Glu Leu Arg Gly Lys Lys Thr Ala Ile Lys
340 345 350

Lys Met Asp Val Gln Ala Ser Thr Glu Phe Leu Cys Glu Leu Lys Val
355 360 365

Leu Thr His Val His His Leu Asn Leu Val Arg Leu Ile Gly Tyr Cys
370 375 380

Val Glu Gly Ser Leu Phe Leu Val Tyr Glu His Ile Asp Asn Gly Asn
385 390 395 400

Leu Gly Gln Tyr Leu His Gly Ser Gly Lys Glu Pro Leu Pro Trp Ser
405 410 415

Ser Arg Val Gln Ile Ala Leu Asp Ala Ala Arg Gly Leu Glu Tyr Ile
420 425 430

His Glu His Thr Val Pro Val Tyr Ile His Arg Asp Val Lys Ser Ala
435 440 445

Asn Ile Leu Ile Asp Lys Asn Leu Arg Gly Lys Val Ala Asp Phe Gly
450 455 460

Leu Thr Lys Leu Ile Glu Val Gly Asn Ser Thr Leu Gln Thr Arg Leu
465 470 475 480

Val Gly Thr Phe Gly Tyr Met Pro Pro Glu Tyr Ala Gln Tyr Gly Asp
485 490 495

Ile Ser Pro Lys Ile Asp Val Tyr Ala Phe Gly Val Val Leu Phe Glu
500 505 510

Leu Ile Ser Ala Lys Asn Ala Val Leu Lys Thr Gly Glu Leu Val Ala
515 520 525

Glu Ser Lys Gly Leu Val Ala Leu Phe Glu Glu Ala Leu Asn Lys Ser
530 535 540

Asp Pro Cys Asp Ala Leu Arg Lys Leu Val Asp Pro Arg Leu Gly Glu
545 550 555 560

Asn Tyr Pro Ile Asp Ser Val Leu Lys Ile Ala Gln Leu Gly Arg Ala
565 570 575

Cys Thr Arg Asp Asn Pro Leu Leu Arg Pro Ser Met Arg Ser Leu Val
580 585 590

Val Ala Leu Met Thr Leu Ser Ser Leu Thr Glu Asp Cys Asp Asp Glu
595 600 605

Ser Ser Tyr Glu Ser Gln Thr Leu Ile Asn Leu Leu Ser Val Arg
610 615 620



<210> 26 
<211> 19 
<212> DNA 

<213> Lotus japonicus 
<400> 26 

aatgctcttg atcaggctg 19



<210> 27 
<211> 20 
<212> DNA 

<213> Lotus japonicus
<400> 27 

aggagcccaa gtgagtgcta 20 



<210> 28 

<211> 20 

<212> DNA 

<213> Lotus japonicus 



<400> 28 

caggaaaaac caccacctgt 20 

<210> 29 
<211> 21 
<212> DNA

<213> Lotus japonicus 
<400> 29 

atggaggcga atacactggt g 21 

<210> 30 
<211> 1853 
<212> DNA 

<213> Lotus filicaulis 
<400> 30 

ttttctcttt ccctgttaac tatcatttgt tcccaacttc acaaacatgg ctgtcttctt 60 

tcttacctct ggctctctga gtctttttct tgcactcacg ttgcttttca ctaacatcgc 120 

cgctcgatca gaacagatca gcggcccaga cttttcatgc cctgttgact cacctccttc 180 

ttgtgaaaca tatgtgacat acacagctca gtctccaaat cttctgagcc tgacaaacat 240

atctgatata tttgatatca gtcctttgtc cattgcaaga gccagtaaca tagatgcagg 300 

gaaggacaag ctggttccag gccaagtctt actggtacct gtaacttgcg gttgcgccgg 360 

aaaccactct tctgccaata cctcctacca aatccagaaa ggtgatagct acgactttgt 420 

tgcaaccact ttatatgaga accttacaaa ttggaatata gtacaagctt caaacccagg 480

ggtaaatcca tatttgttgc cagagcgcgt caaagtcgta ttccctttat tctgcaggtg 540 

cccttcaaag aaccagttga acaaagggat tcagtatctg attacttatg tgtggaagcc 600 

caatgacaat gtttcccttg tgagtgccaa gtttggtgca tccccagcgg acatattgac 660 

tgaaaaccgc tacggtcaag acttcactgc tgcaaccaac cttccaattt tgatcccagt 720 

gacacagttg ccaaagctta ctcaaccttc ttcaaatgga aggaaaagca gcattcatct 780 

tctggttata cttggtatta ccctgggatg cacgttgcta actgcagttt taaccgggac 840 

cctcgtatat gtatactgcc gcagaaagaa ggctctgaat aggactgctt catcagctga 900 

gactgctgat aaactacttt ctggagtttc aggctatgta agcaagccaa acgtgtatga 960 

aatcgacgag ataatggaag ctacgaagga tttcagcgat gagtgcaagg ttggggaatc 1020 

agtgtacaag gccaacatag aaggtcgggt tgtagcggta aagaaaatca aggaaggtgg 1080 

tgccaatgag gaactgaaaa ttctgcagaa ggtaaatcat ggaaatctgg tgaaactaat 1140 

gggtgtctcc tcaggctatg atggaaactg tttcttggtt tatgaatatg ctgaaaatgg 1200 



gtctcttgct gaatggctat nnnnnnnnnn ntcaggaacc ccaaactccc ttacatggtc 1260

tcaaaggata agcatagcag nnnnnnnnnn tgtgggtctg caatacatgc atgaacatac 1320

ctatccaaga nnnnnnnnnn nnnnnnnnnn aacaagcaat atccttctcg actcgacctt 1380

caaggccaag atagcaaatt nnnnnnnnnn cagaacctcg accaacccca tgatgccaaa 1440

aatcgatgtc nnnnnnnnnn gggtgcttct gatagagttg ctcaccggaa ggaaagccat 1500

gacaaccaaa nnnnnnnnnn aggtggtcat gctgtggaag gatatgtggg agatctttga 1560


catagaagag aatagagagg agaggatcag aaaatggatg gatcctaatt tagagagctt 1620

ttatcatata gataatgctc tcagcttggc atccttagca gtgaattgca cagctgataa 1680

gtctttgtct cgaccctcca tggctgaaat tgttcttagc ctctcctttc tcactcaaca 1740

atcatctaac cccacattag agagatcctt gacttcttct gggttagatg tagaagatga 1800

tgctcatatt gtcacttcca ttacagcacg ttaagcaagg gaaggtaatt cag 1853



<210> 31 
<211> 595 
<212> PRT 

<213> Lotus filicaulis
<400> 31 

Met Ala Val Phe Phe Leu Thr Ser Gly Ser Leu Ser Leu Phe Leu Ala
1 5 10 15

Leu Thr Leu Leu Phe Thr Asn Ile Ala Ala Arg Ser Glu Gln Ile Ser
20 25 30

Gly Pro Asp Phe Ser Cys Pro Val Asp Ser Pro Pro Ser Cys Glu Thr
35 40 45

Tyr Val Thr Tyr Thr Ala Gln Ser Pro Asn Leu Leu Ser Leu Thr Asn
50 55 60

Ile Ser Asp Ile Phe Asp Ile Ser Pro Leu Ser Ile Ala Arg Ala Ser
65 70 75 80

Asn Ile Asp Ala Gly Lys Asp Lys Leu Val Pro Gly Gln Val Leu Leu
85 90 95

Val Pro Val Thr Cys Gly Cys Ala Gly Asn His Ser Ser Ala Asn Thr
100 105 110



Ser Tyr Gln Ile Gln Lys Gly Asp Ser Tyr Asp Phe Val Ala Thr Thr
115 120 125



Leu Tyr Glu Asn Leu Thr Asn Trp Asn Ile Val Gln Ala Ser Asn Pro
130 135 140

Gly Val Asn Pro Tyr Leu Leu Pro Glu Arg Val Lys Val Val Phe Pro
145 150 155 160

Leu Phe Cys Arg Cys Pro Ser Lys Asn Gln Leu Asn Lys Gly Ile Gln
165 170 175

Tyr Leu Ile Thr Tyr Val Trp Lys Pro Asn Asp Asn Val Ser Leu Val
180 185 190

Ser Ala Lys Phe Gly Ala Ser Pro Ala Asp Ile Leu Thr Glu Asn Arg
195 200 205

Tyr Gly Gln Asp Phe Thr Ala Ala Thr Asn Leu Pro Ile Leu Ile Pro
210 215 220

Val Thr Gln Leu Pro Lys Leu Thr Gln Pro Ser Ser Asn Gly Arg Lys
225 230 235 240

Ser Ser Ile His Leu Leu Val Ile Leu Gly Ile Thr Leu Gly Cys Thr
245 250 255

Leu Leu Thr Ala Val Leu Thr Gly Thr Leu Val Tyr Val Tyr Cys Arg
260 265 270

Arg Lys Lys Ala Leu Asn Arg Thr Ala Ser Ser Ala Glu Thr Ala Asp
275 280 285

Lys Leu Leu Ser Gly Val Ser Gly Tyr Val Ser Lys Pro Asn Val Tyr
290 295 300

Glu Ile Asp Glu Ile Met Glu Ala Thr Lys Asp Phe Ser Asp Glu Cys
305 310 315 320

Lys Val Gly Glu Ser Val Tyr Lys Ala Asn Ile Glu Gly Arg Val Val
325 330 335

Ala Val Lys Lys Ile Lys Glu Gly Gly Ala Asn Glu Glu Leu Lys Ile
340 345 350

Leu Gln Lys Val Asn His Gly Asn Leu Val Lys Leu Met Gly Val Ser
355 360 365

Ser Gly Tyr Asp Gly Asn Cys Phe Leu Val Tyr Glu Tyr Ala Glu Asn
370 375 380

Gly Ser Leu Ala Glu Trp Leu Phe Ser Lys Ser Ser Gly Thr Pro Asn
385 390 395 400

Ser Leu Thr Trp Ser Gln Arg Ile Ser Ile Ala Val Asp Val Ala Val
405 410 415

Gly Leu Gln Tyr Met His Glu His Thr Tyr Pro Arg Ile Ile His Arg
420 425 430

Asp Ile Thr Thr Ser Asn Ile Leu Leu Asp Ser Thr Phe Lys Ala Lys
435 440 445

Ile Ala Asn Phe Ala Met Ala Arg Thr Ser Thr Asn Pro Met Met Pro
450 455 460

Lys Ile Asp Val Phe Ala Phe Gly Val Leu Leu Ile Glu Leu Leu Thr
465 470 475 480

Gly Arg Lys Ala Met Thr Thr Lys Glu Asn Gly Glu Val Val Met Leu
485 490 495

Trp Lys Asp Met Trp Glu Ile Phe Asp Ile Glu Glu Asn Arg Glu Glu
500 505 510

Arg Ile Arg Lys Trp Met Asp Pro Asn Leu Glu Ser Phe Tyr His Ile
515 520 525

Asp Asn Ala Leu Ser Leu Ala Ser Leu Ala Val Asn Cys Thr Ala Asp
530 535 540

Lys Ser Leu Ser Arg Pro Ser Met Ala Glu Ile Val Leu Ser Leu Ser
545 550 555 560

Phe Leu Thr Gln Gln Ser Ser Asn Pro Thr Leu Glu Arg Ser Leu Thr
565 570 575

Ser Ser Gly Leu Asp Val Glu Asp Asp Ala His Ile Val Thr Ser Ile
580 585 590



Thr Ala Arg
595
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